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I.  INTRODUCTION 

This  document  is  an  interim  annual  report  on  the  research  accomplished 
under  the  sponsorship  of  Air  Force  Office  of  Scientific  Research  Grant  No. 
AFOSR  83-0160  during  the  time  period  March  18,  1083  through  May  17,  1684. 
The  research  performed  lies  in  five  major  areas:  (1)  Optical  interconnections;  (2) 
Nonlinear  optical  information  processing  using  four-wave  mixing;  (3)  Suppression 
of  speckle  in  coherently  formed  images;  (4)  Diagonalization  and  inversion  of  circu- 
lant  and  Toeplitz  matrices  using  coherent  optics;  and  (5)  Information  processing 
in  linear  birefringent  dispersive  materials.  We  summarize  the  progress  in  each  of 
these  areas  (Sections  0  through  VI).  In  addition  we  present  other  information 
pertinent  to  the  grant  in  the  final  section. 

D.  OPTICAL  INTERCONNECTIONS 

The  primary  and  most  important  project  under  way  is  an  investigation  of 
the  possible  applications  of  optics  to  electronic  interconnection  problems.  Inter¬ 
connections  are  among  the  most  challenging  problems  facing  the  electronics 
indust ry  today.  Optics  is  being  successfully  applied  to  the  problem  of  machine- 
to-uiachine  interconnections  (fiber-optic  local  area  networks),  but  there  are  inter¬ 
connection  problems  at  many  other  levels  of  electronic  architectures,  from  high¬ 
speed  buses  within  a  single  machine  to  board-to- board,  chip-to-chip,  and  even 
within-chip  communications.  The  most  difficult  problems  in  many  respects  are 
those  at  the  lowest  levels  of  architecture  (cbip-to-chip  and  within-  chip),  and  it  is 
at  these  levels  that  the  majority  of  our  work  is  focused.  The  uniqueness  of  our 
work  fit's  in  the  emphasis  placet!  on  'imaging'  interconnections,  for  which  a  con¬ 
nection  channel  is  established  by  means  of  a  holographic  imaging  element  that 
images  a  small  source  onto  one  or  more  small  detectors.  Work  is  aimed  at  both 
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tin*  conceptual  level  and  at  the  practical  level.  In  the  conceptual  area,  we  are 
attempting  to  discover  the  most  important  scenarios  in  which  optical  intercon¬ 
nections  make  sense.  A  good  summary  of  the  level  of  our  current  conceptual 
understanding  is  contained  in  the  preprint  of  the  paper  entitled  "Optical  inter¬ 
connections  in  microelectronics",  w'hich  is  attached  as  Appendix  1.  This  paper 
will  be  published  in  t  he  Proceedings  of  the  SPIE  volume  entitled  Optical  Comput¬ 
ing:  a  Critical  Review  of  Technology. 

At  the  practical  level,  we  have  embarked  upon  two  different  tasks.  One  is  to 
evaluate  the  use  of  optical  interconnections  as  a  means  for  communication 
between  chips.  The  other,  just  getting  under  way,  b  an  investigation  of  clock 
distribution  within  a  single  chip  using  optics.  Since  the  latter  project  is  just  start¬ 
ing.  we  concentrate  our  comments  here  on  the  progress  in  the  former  area. 

Communication  between  distant  locations  (i.e.  several  mm  to  a  few  cm)  on 
planar  substrates  b  a  limiting  factor  of  VLSI  system  performance.  The  high  den¬ 
sity  of  input  and  output  terminab  of  integrated  subunits  (CPU’s,  buffers.  ROM’s, 
etc.)  pn*scnt  serious  pad  bonding  nnd  signal  drive  power  difficulties.  In  addition, 
the  point-t«>-|»oint  communications  restrictions  imposed  by  conventional  elec¬ 
tronic  interconnection  schemes  oncombers  algorithm  design. 

The  particular  interconnect  problem  addressed  in  this  work  is  described  as 
follows.  Two  electronic  integrated  circuit  units  fabricated  on  a  substrate  with  an 
array  of  input  /output  bonding  pads  are  linked  by  deposited  aluminum  or  a 
hybridized  conducting  material.  The  bonding  p.ids  are  typically  125  micrometers 
square.  The  average  power  required  to  send  a  signal  over  a  terminated  line  in 
this  situation  is  about  <10  milliwatts  for  a  2  volt  signal  with  a  50  ohm  termina¬ 
tion.  To  date  the  highest  modulation  rates  for  arrangements  of  this  type  are 


about  100  MHz.  Projected  required  transmission  modulation  rates  are  on  the 
order  of  tens  of  GHz.  Our  goal  is  to  investigate  the  feasibility  of  an  optical 
replacement  for  the  interconnection  just  described. 

The  optical  version  of  the  interconnect  geometry  is  described  as  follows.  An 
information-bearing  signal  on  one  chip  drives  a  light  source;  the  emitted  flux  is 
collected  by  an  imaging  element  and  imaged  onto  a  detector  on  the  second  chip, 
where  it  is  converted  to  an  electronic  signal.  Information  transfer  between  the 
two  chips  thus  takes  place  optically,  rather  than  electronically. 

An  experimental  system  for  testing  the  concept  was  constructed.  It  uses  a 
Litronix  R513  standard  red  LED  with  peak  emission  wavelength  660  nm  in  a  20 
nm  bandwidth.  Approximately  70  mw  of  electrical  power  is  required  to  produce 
an  optical  intensity  of  60  microwatts  per  steradian.  The  detectors  are  Hewlett- 
Packard  photodiodes  from  a  high-speed  opto-coupler.  The  active  area  of  the 
photodiode  is  about  280  micrometers  square,  while  the  area  of  the  emitter  is 
about  2i0  micrometers  square.  Imaging  is  achieved  with  a  reflection  volume 
phase  hologram  formed  on  an  Agfn-Gnevort  8K7S  HI)  emulsion.  A  pyrogallol- 
sodium  carbonate  developer  (Agfa  #GP-62)  and  a  potassium  bromide  p- 
bonzoqiiinonc  bleach  (G P-432)  were  used.  The  holographic  element  was  formed 
with  on-axis  converging  and  diverging  spherical  beams  from  a  HeNe  laser.  The 
source  and  image  points  were  about  <1.5  cm  from  emulsion  plane  producing  a 
beam  overlap  region  of  about  1.5  cm  diameter. 

When  illuminated  with  the  660  nm  LED,  the  peak  hologram  diffraction 
efficiency  was  about  5.5%.  Image  irradiance  and  resolution  were  at  off-axis 
illumination  angles  were  evaluated  with  a  CCD  line  scanner.  Image  resolution  at 
1.0  cm  varied  from  from  270  micrometers  at  7  mm  source-detector  separation  to 


330  micrometers  at  30  mm  separation.  Astigmatism  was  the  predominant  aberra¬ 
tion  at  large  angles  of  incidence.  At  an  18  degree  angle  of  incidence,  image  inten¬ 
sity  decreased  to  50%  of  the  value  at  a  0  degree  angle  of  incidence.  Thus  18 
degrees  is  considered  the  useful  field-of-view  of  this  hologram. 

Two  source-detector  mounting  geometrit  were  tested.  In  the  first,  an  LED 
and  detector  diode  were  separated  by  00  micr«  ters  distance  corresponding  to 
typical  bonding  pad  separations  on  an  IC.  The  purpose  of  this  geometry  was  to 
evaluate  crosstalk  between  a  source  and  a  nearby  detector  on  a  single  chip.  A 
second  test  configuration  with  a  source-detector  separation  of  1.1  mm  was  also 
used  to  simulate  an  interconnection  between  two  different  1C  units.  Unwanted 
scattered  light  was  found  to  be  sufficiently  low  in  level  to  not  pose  a  difficulty  for 
the  system.  The  limits  to  system  performance  posed  by  internal  receiver  noise 
are  now  being  evaluated.  A  major  effort  in  the  future  must  be  invested  in  the 
design  and  realization  of  low-noise  receiver  circuitry  suitable  for  realization  on 
integrated  circuit  chips. 

The  work  at  Stanford  on  optical  interconnections  and  supported  by  AFOSR 
ha*  received  considerable  recognition  in  the  scientific  community.  As  mentioned 
previously,  an  invited  paper  was  presented  at  the  SPIE  critical  review  of  optical 
computing  technology.  An  invited  paper  will  be  published  in  the  July  1084  issue 
of  the  Proceeding  of  the  IEEE.  The  keynote  address  of  the  13th  General  Meet¬ 
ing  of  the  International  Commission  for  Optics  will  be  presented  on  this  subject 
by  Prof.  Goodman  in  August  1084,  and  the  first  R.V.  Pole  Memorial  Lecture  will 
be  presented  on  this  topic  by  Prof.  Goodman  at  the  1084  CLEO  meeting  in  June 


HI.  NONLINEAR  INFORMATION  PROCESSING  USING  PHO- 
TO REFRACTIVE  MATERIALS 


Pbotorefractive  materials  are  of  interest  in  image  processing  because  of  their 
holographic  analog  to  conventional  Fourier  optical  processing  systems.  Other 
groups  have  reported  using  two-wave  and  four-wave  mixing  techniques  in  pho- 
torefractive  crystals  to  perform  optical  phase  conjugation,  image  convolution  and 
correlation,  edge  enhancement,  image  amplification,  and  image  division.  We  are 
interested  in  image  inversion  and  have  been  investigating  the  characteristics  of 
photorefraetives  which  generate  and  limit  the  inversion  process. 

The  origin  of  our  interest  in  image  inversion  using  photorefractive  materials 
rests  oil  i lie  iiuiisii:il  signal  processing  operations  that  ran  be  performed  with  such 
an  elTcet.  Image  dcldurring.  code  translation,  detection  of  non-periodic  struc¬ 
tures  in  a  periodic  field,  detection  of  periodic  structures  in  a  non-periodic  field, 
and  diagonalization  and  inversion  of  circulant  matrices  (see  section  V)  are  all 
examples  of  operations  that  can  be  performed  if  a  method  for  image  inversion,  or 
more  properly,  wavefront  inversion,  is  realized. 

In  a  phot oic fractivc  medium,  a  space-charge  field  is  generated  when  trapped 
charges  are  excited  by  an  intensity-interference  pattern  and  subsequently 
ret  rapped  after  transport  by  drift  and  diffusion.  This  field  is  proportional  to  the 
modulation  depth,  m  ,  of  the  intensity  grating,  unless  m  approaches  unity.  If  /, 
represents  the  intensity  of  the  object  beam  and  I,  represents  the  intensity  of  the 
reference  beam,  then 
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Through  the  electro-optic  effect,  the  change  in  refractive  index  has  the  same 
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dependence  as  the  space-charge  field.  A  third  beam  performs  read-out  of  the 
grating  in  real-time.  When  plane  waves  are  used  as  the  input  beams,  the  result¬ 
ing  steady-state  expression  for  diffraction  efficiency  is 
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The  exptessioii  for  diffraction  efficiency  can,  for  small  argument,  be  approximated 
by  the  square  of  the  argument;  thua,  diffraction  efficiency  i$  proportional  to  the 
square  of  the  modulation  depth.  Because  m  depends  only  on  relative  intensities 
between  the  object  beam  and  the  reference  beem,  and  not  on  absolute  intensity 
of  either  beam,  variation  of  the  beam  ratio  away  from  unity  causes  diffraction 
efficiency  to  drop.  At  appropriate  beam  ratios,  therefore,  image  inversion  can 
occur.  In  certain  cases,  it  n  appropriate  to  use  an  effective  modulation  depth 
which  includes  effects  of  absorption  of  the  crystal,  «,  and  total  light  intensity 
incident  on  the  crystal,  lT  .  Here, 


1+ 


oh 

and  ,i  is  dependent  on  crystal  parameters. 

Our  experimental  >et-up  uses  Argon  laser  green  light  to  write  a  grating  in 
HSO  or  I JOO.  and  a  He-Nc  laser  is  used  to  read  it  out.  Minor  adjustments  need 
to  he  made  to  the  formulas  to  account  for  a  different  read-out  wavelength.  We 
compare  the  experimental  data  with  theoretical  rah  ulat ions.  We  have  been 
exploring  the  parameters  which  affect  the  dynamic  range  of  the  beam  ratio  over 
which  inversion  occurs.  In  simulating  the  tu'o-bcam  coupling  process,  we  have 
shown  that  varying  the  electro-optic  coefficient  changes  the  diffraction  efficiency 
at  a  constant  beam  ratio  as  well  as  the  range  of  the  beam  ratio  where  inversion  is 
seen.  The  concentration  of  donors  and  traps  is  another  crystal  parameter  which 
affects  diffraction  efficiency  and  invention.  When  keeping  all  crystal  parameters 
fixed,  inversion  is  influenced  by  feature*  of  the  experimental  set-up.  In  particu¬ 
lar.  it  appear*  to  be  pomihlc  In  ‘•lune’’  the  inversion  dynamic  range  by  varying 
an  applied  electric  field  across  the  crystal.  Total  light  intensity  reaching  the  co¬ 
stal  may  be  another  tuning  parameter.  Once  the  two-beam  process  is  well  under¬ 
stood.  we  plan  to  use  four-wave  mixing  techniques  to  invert  images. 

IV.  SUPPRESSION  OF  SPECKLE  IN  COHERENTLY  FORMED 
IMAGES 

All  coherently  formed  images,  including  those  obtained  form  side-looking 
radar  systems,  laser-illuminated  imaging  systems,  and  medical  ultrasound,  con¬ 
tain  a  disturbing  noiae  known  aa  speckle.  Speckle  arises  whenever  the  coherence 
length  of  the  illuminating  radiation  is  long  compared  hi  the  surface  roughness  of 
the  object  to  be  imaged.  For  some  time  we  have  been  involved  in  an  evaluation 


of  a  wide  variety  of  techniques  for  suppressing  speckle  by  means  of  digital  image 
processing.  Our  work  provides  the  first  quantitative  comparison  of  these  diverse 
methods,  and  also  introduces  some  new  techniques  that  have  never  been  tried 
before. 

Oxer  the  past  20  years  or  so,  many  researchers  have  tried  many  different 
methods  for  suppressing  speckle  by  post- detect  ion  filtering.  Popular  methods 
include  (I)  simple  smoothing  of  the  picture  with  linear  filters;  (2)  homomorphic 
filtering  consisting  of  a  logarithmic  transformation,  followed  by  linear  filtering, 
followed  by  an  exponential  transformation;  and  (3)  two-dimensional  median 
filtering.  Our  work  has  compared  all  of  these  methods  for  certain  objects  with 
mean-sqiiare  error  ns  a  quality  criterion.  In  addition,  we  have  introduced  new 
method*  which  include;  (1)  nonlinear  one-dimensional  filtering  in  a  projection 
space;  and  a  maximum  entropy  method. 

The  results  of  the  work  have  identified  what  we  believe  to  be  the  most 
promising  approaches.  Of  the  methods  studied,  most  consistently  good  results 
were  obtained  using  one  dimensional  square  root  filtering  (i.e.  a  square  root  non¬ 
linearity.  followed  by  linear  filtering,  followed  by  a  square-law  transformation)  in 
a  projection  space. 

Three  papers  have  been  prepared  on  the  results  of  this  work,  and  effective 
June  lfe<d.  the  project  has  been  terminated.  The  papers  will  he  submitted  to 
various  journals  in  the  near  future. 

V.  DIAGONALIZATION  AND  INVERSION  OF  CIRC  UL  ANT 
MATRICES  BY  COHERENT  OPTICS 

For  some  time  we  have  been  studying,  both  theoretically  and  experimen- 


tally,  tome  method'  f or  diagonalizing  and  inverting  circulant  matrices  using 
coherent  optics.  Inversion  of  circulant  approx iinatious  to  Toeplitr  matrices  is  « 
prohh-m  that  atisi*s  frequently  in  optimal  tillering  problems.  1‘sing  the  methods 
\ve  hat e  been  studying,  it  is  |tossjldc  to  invert  as  many  as  several  hundred  circu- 
la n t  matrices,  each  of  dimension  several  hundred  bv  several  hundred,  simultane¬ 
ously  on  one  pass  through  a  coherent  optical  system.  Such  a  capability  may  have 
application  to  estimation  problems  involving  antenna  arrays  with  many  elements, 
with  many  resolvable  frequency  bands  in  each  antenna  lobe. 

The  basic  idea  behind  this  work  has  recently  been  published  in  Applied 
Oplie*.  (lather  than  repeat  that  material  here,  we  have  attached  a  reprint  of  the 
paper  as  Appendix  2.  Since  the  publication  of  that  paper,  we  have  pushed  the 
work  further,  successfully  inverting  circulant  matrices,  using  photographic  film  as 
the  nonlinear  element  needed  to  accomplish  such  inversion.  Cliimatrly  interest 
rests  in  the  use  of  real-time  nonlinear  elements  for  accomplishing  this  inversion. 
Indeed  the  work  on  wavefront  inversion  using  pb*»torefractive  crystals  described 
in  Section  III  is  directly  applicable  to  this  problem. 

A  paper  i<  now  being  written  describing  the  further  experiments  that 
resulted  in  actual  matrix  inversions.  This  paper  will  complete  our  work  on  this 
subject,  and  the  project  will  be  terminated  at  the  end  of  June  IW4. 

VI.  INFORMATION  PROCESSING  IN  LINEAR  BIREFRINGENT 
DISPERSIVE  MATERIALS 

An  area  of  recent  interest  to  us  has  been  the  study  of  optical  propagation  in 
linear,  bi  ref r  in  gent,  dispersive  media,  and  the  possibility  of  performing  useful  sig¬ 
nal  processing  operations  with  devices  based  on  such  media.  It  has  been  shown 
through  this  work  that  in  the  process  of  propagation  of  a  space-time  wavefield  in 


siM-h  a  medium.  a  rrtK*-amhiguity  function  of  the  spatial  and  temporal  part**  of 
tin*  input  distribution  is  generated.  provided  the  di»persion  relation**  of  the 
iiii'diuui  ex liihii  a  certain  coupling  bet h ecu  tent|M»ral  dispersion  and  spatial  aniso¬ 
tropy  *1  In**  new  result  opens  new  avenues  for  novel  signal  processing  metht»ds  in 
the  douii.in  of  nllrnfnst  optics. 

E»*ut  tally.  the  quality  to  be  possessed  by  materials  which  generate  the 
ambiguity  function  is  the  following:  axial  dispersion  of  temporal  wavepackets 
varies  with  direction  of  propagation  within  the  crystal.  Thus  the  group  velocity 
b  anisotropic  and  the  materials  exhibit  anisotropic  dispersion.  An  alternative 
description  is  that  the  Poynfing  vector  walkoff  angle  for  the  required  materials 
exhibits  dispersive  anisotropy,  in  the  sense  that  its  spatial  anisotropy  characteris¬ 
tics  chnnge  with  temporal  frequency. 

To  pursue  the  idea  to  the  practical  implementation  of  an  optical  processor, 
much  work  remains  to  be  done  in  identifying  suitable  crystals  with  dispersive 
anisotropy  (or  anisotropic  dispersion)  compatible  with  the  requisite  time  resolu¬ 
tion.  The  significance  of  the  current  results  for  the  domain  of  ultrafast  optics 
remains  to  be  evaluated.  A  particularly  attractive  prospect  consists  of  converting 
the  convolutional  ambiguity  function  processor  into  a  temporal  time-invariant 
filter  which  could  compute  convolutions  with  an  impulse  response  that  b  spatially 
preset  In  means  of  a  spatial  modulator. 

VO.  MISCELLANEOUS  INFORMATION 

This  section  contains  miscellaneous  information  regarding  the  personnel  and 
publications  associated  with  the  AFOSR  grant.  The  following  individuals  contri¬ 
buted  to  the  research  output  of  the  grant: 
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Abstract 

As  ths  complexity  of  microelectronic  circuits  Increases,  per fores nee  becomes  more  and 
more  limited  by  interconnections.  Continued  scaling  and  packing  lead  to  a  dominance  of 
interconnect  delays  over  gate  delays.  This  paper  explores  the  potential  of  optical  inter¬ 
connections  as  a  mean  for  alleviating  such  limitations.  Various  optical  approaches  to  the 
problem  are  discussed,  including  the  use  of  guided  waves  (Integrated  optics  and  fiber 
optics)  and  free  space  propagation  (simple  broadcast  and  imaging  interconnections) .  The 
utility  of  optics  is  influenced  by  the  nature  of  the  algorithms  that  are  being  carried  out 
in  computations.  Certain  algorithms  make  far  greater  demands  on  interconnections  than  do 
others.  Clock  distribution  is  a  specific  application  where  optics  can  make  an  immediate 
contribution.  Data  interconnections  are  more  demanding,  and  require  the  development  of 
hybrid  Si/CaAs  devices  and/or  heteroepltaxlal  structures  containing  both  si  and  GaAs  layers. 
The  possibilities  for  future  developments  in  this  area  are  discussed. 

i*  Introduction  _ 

In  the  field  of  computation,  optios  is  currently  best  known  for  its  role  in  analog  sig¬ 
nal  processing.  Examples  include  acousto-optic  spectrum  analysers,  convolvers  and  correla¬ 
tors  (1),  (3),  and  systems  for  forming  images  from  synthetic-aperture  radar  data  (3),  (4). 
Analog  approaches  of  this  kind  offer  high  processing  speed,  but  low  accuracy  and  limited 
flexibility  in  terms  of  the  variety  of  operations  that  can  be  performed.  These  shortcomings 
have  lead  to  a  search  for  applications  of  optics  to  digital  (9),  [6]  and  other  types  of 
high-accuracy  numerical  computation  (7],  (•).  However,  regardless  of  the  outcome  of  the  new 
thrusts  towards  digital-optical  computation,  it  is  virtually  certain  that  the  vast  majority 
of  computation  done  in  the  world  for  many  years  to  come  will  be  performed  by  microelectronic 
chips.  While  current  ideas  in  digital-optical  computing  have  the  potential  to  strongly 
impact  special-purpose  digital  signal  processing,  they  are  very  unlikely  to  invade  a  signi¬ 
ficant  fraction  of  the  total  computing  market  in  the  foreseeable  future,  xt  is  therefore 
natural  to  inquire  as  to  whether  there  exists  another  role  for  optics  in  digital  computing, 
one  that  has  the  potential  for  greater  impact  on  the  overall  computing  market.  .  _ 

A  digital  computer  or  computational  unit  consists  primarily  of  nonlinear  devices  (logic 
gates)  in  which  input  signals  interact  to  produce  octput  signals,  and  interconnections 
between  such  devices  or  groups  of  devices  of  various  sises  and  complexity.  The  nonlinear 
interactions  required  of  individual  computational  elements  are  realised  in  optics  only  with 
considerable  difficulty,  various  kinds  of  optical  light  valves  have  been  utilised  to  real¬ 
ise  a  multitude  of  parallel  nonlinear  elements  (9),  (10),  but  the  speeds  at  which  such  dev¬ 
ices  can  operate  are  exceedingly  slow  by  comparison  with  equivalent  electronic  elements. 
Recent  discoveries  in  the  area  of  optical  bistability  have  generated  new  interest  in  con¬ 
structing  optical  logic  gates  that  are  even  faster  than  their  electronic  counterparts,  but 
currently  the  efficiency  of  such  devices  is  low  and  the  device  concepts  are  too  little 
explored  to  allow  a  full  assessment  of  their  potential.  The  construction  of  optical  logic 
gates  with  speeds,  densities  and  efficiencies  equaling  or  exceeding  those  of  electronic 
gates  remains  problematical,  although  future  progress  is  certainly  possible. 

while  optics  lags  behind  electronics  in  the  realisation  of  needed  nonlinear  elements, 
nonetheless  the  borison  for  electronics  is  not  without  clouds.  Xt  is  generally  realised  and 
agreed  that  the  exponential  growth  of  semiconductor  chip  capabilities  can  not  continue  inde¬ 
finitely,  and  indeed  important  limits  are  beginning  to  be  felt  already.  These  limits  arise 
not  from  difficulties  associated  with  the  further  reduction  of  gate  areas  and  delays,  but 
rather  from  the  difficulties  associated  with  Interconnections  as  dimensions  are  further 
scaled  down  and  chip  areas  continue  to  Increase  (11),  (12).  Indeed,  interconnection  diffi¬ 
culties  extend  beyond  the  chip  level,  and  have  major  impact  at  the  board  level  as  well  (131.. 

Given  the  above  facts,  it  seems  natural  to  Inquire  as  to  whether  optics  might  offer, 
important  capabilities  in  overooming  the  interconnect  problems  associated  with  microelec¬ 
tronic  circuits  and  systems.  Encouragement  is  offered  by  the  fact  that  the  very  property  of 
optics  that  makes  it  difficult  to  realise  nonlinear  elements  (it  is  difficult  to  make  two 
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•creams  of  photons  interact)  is  precisely  the  property  desired  of  an  interconnect  technol- 
ogy.  However ,  at  the  chip  level ,  optics  is  likely  to  be  only  one  element  of  a  hierarchy  of 
interconnect  technologies,  with  optical  interconnection  networks  feeding  more  conventional 
interconnect  lines  constructed  from  polycrystalline  silicon,  metal  silicides,  or  metals. 

There  exists  an  entire  hierarchy  of  interconnect  problems  for  which  the  implications  of 
optics  should  be  considered.  At  one  extreme  is  the  problem  of  machine-to-machine  communica¬ 
tion  (local  area  networks).  Such  problems  are  excluded  from  our  consideration  here,  due  to 
the  fact  that  so  much  work  is  already  in  progress  by  others.  The  next  level  is  subsystem  to 
subsystem  communication  within  a  single  computer.  Much  less  work  has  been  done  on  optical 
approaches  to  this  problem,  but  at  least  one  example  exists  (14).  Again  we  exclude  such 
problems  from  consideration  here,  preferring  to  concentrate  on  communication  problems  at 
lower  levels  of  machine  structure.  Board-to-board  communication  with  optics  can  be  thought 
of  as  the  problem  of  constructing  an  optical  data  bus  within  a  machine.  Some  ideas  pertinent 
to  this  level  of  interconnection  have  been  published  (see,  for  example,  Ref.  (IS)),  but 
again  we  exclude  these  problems  from  consideration.  Interconnection  within  a  single  board 
(dimensions  of  approximately  1  ft  x  1  ft)  is  in  the  realm  where  our  discussions  are  per¬ 
tinent,  as  is  likewise  the  problem  of  interconnection  within  a  wafer  (typical  dimension  7.S 
cm  diameter)  or  within  a  single  chip  (typical  dimensions  10  mm  x  10  mm).  Most  of  our  dis¬ 
cussions  will  refer  to  the  chip-level  problem,  but  many  of  the  ideas  can  be  extrapolated  to 
the  wafer  and  board  levels.  In  addition,  most  of  the  discussion  will  assume  that  we  are 
dealing  with  common  metal-oxide-seaiconductor  (MOS)  integrated  circuit  technology. 

The  purpose  of  this  paper  is  to  stimulate  further  thought  and  research  on  optical 
interconnections.  The  ideas  presented  are  rather  preliminary  and  underdeveloped,  but  hope¬ 
fully  they  have  the  potential  to  serve  as  a  starting  point  for  others  in  considering  how 
optics  might  play  a  more  widespread  role  in  computing  of  the  future.  Section  II  reviews  the 
origins  and  nature  of  the  electronic  Interconnect  problem  from  the  technological  polnt-of- 
view.  Section  III  briefly  discusses  the  interaction  between  interconnections  and  algorithmic 
considerations,  thus  examining  the  algorithmic  side  of  the  motivation  for  optical  intercon¬ 
nects.  Section  VI  discusses  the  potential  benefits  that  optics  might  bring  to  a  solution  to 
these  problems  and  outlines  two  fundamentally  different  optical  approaches  to  interconnec¬ 
tions,  the  index-guided  approach  and  the  free-space  propagation  approach.  Section  V  specifi¬ 
cally  addresses  the  problem  of  clock  distribution,  while  Section  VI  deals  with  the  problem 
of  data  distribution.  Finally,  Section  VII  discusses  future  developments  needed  if  these 
ideas  are  to  have  practical  impact. 

II.  The  Interconnect  Problem  -  Technological  Motivation 

The  growth  of  integrated  circuit  complexity  and  capabilities  experienced  since  the 
birth  of  the  Industry  has  been  achieved  through  a  combination  of  scaling  down  the  minimum 
feature  size  achievable,  and  a  scaling  up  of  the  maximum  chip  size,  both  subject  to  the  con¬ 
straint  of  reasonable  yield.  The  scaling  process  has  many  beneficial  effects,  but  also 
eventually  causes  difficulties  if  combined  with  packing,  i.e.  the  addition  of  circuitry  in 
order  to  realize  more  complex  chips  in  the  same  area  of  silicon  that  was  used  before  scal¬ 
ing.  Here  we  briefly  discuss  the  good  and  bad  effects  of  scaling.  A  more  detailed  discus¬ 
sion  of  the  subject  is  found  in  Ref.  (11). 

Assume  that  all  the  dimensions,  as  well  as  the  voltages  and  currents  on  the  chip,  are 
scaled  down  by  a  factor  d  (an  d  greater  than  one  implies  that  the  sizes  and  levels  are 
shrinking).  Consider  first  the  effects  of  scaling  on  device  performance.  Obviously,  when 
scaling  down  the  linear  dimensions  of  a  transistor  by  d,  the  number  of  transistors  that  can 
be  placed  on  a  chip  of  given  size  scales  up  as  d  •  In  addition,  the  power  dissipation  per 
transistor  decreases  by  a  factor  d,  due  to  the  fact  that  both  the  threshold  voltage  and  the 
supply  voltage  are  scaled  down  by  d.  Finally,  we  note  that  the  switching  delay  of  a 
transistor  is  scaled  down  by  d,  due  to  the  fact  that  the  channel  length  is  decreased  by  that 
factor. 

Scaling  also  affects  the  interconnections  between  devices.  Figure  1  shows  the  effect 
of  scaling  down  a  conductor  by  a  factor  d.  Since  the  cross-sectional  area  of  the  conductor 
is  decreased  by  a  factor  d  »  the  resistance  per  unit  length  will  increase  by  a  similar  fac¬ 
tor.  If  the  length  of  the  conductor  is  scaled  down  by  d,  as  simple  scaling  implies,  then 
the  net  increase  of  resistance  is  proportional  to  d.  At  the  same  time  scaling  Implies 
changes  of  the  capacitance  of  the  interconnection.  Regarding  the  conductor  as  one  plate  of 
a  parallel  plate  capacitor,  ecaling-down  of  both  linear  dimensions  of  the  plate  by  d  implies 
a  decrease  of  capacitance  by  d  •  However,  scaling  also  Implies  a  decrease  by  d  of  the 
thickness  of  the  oxide  Insulating  layer  separating  the  plates  of  the  capacitor.  Hence  the 
capacitance  is  inversely  proportional  to  the  first  power  of  the  scaling  constant  d.  We  see 
that  the  scaling  up  of  resistance  and  the  scaling  down  of  capacitance  exactly  cancel,  leav¬ 
ing  the  RC  time  constant  unchanged. 

Since  gate  delays  scale  down  with  d  while  interconnect  delays  remain  independent  of  d* 


it  is  clear  that  eventually  a  point  must  be  reached  where  interconnection  delays  dominate 
device  delays.  However,  the  situation  is  actually  much  worse  than  the  above  considerations 
imply,  due  to  the  fact  that  packing  usually  accompanies  scaling,  and  the  lengths  of  the 
interconnects  required  do  not  scale  down  with  4.  Rather,  as  the  complexity  of  the  circuit 
being  realized  increases,  the  distances  over  which  the  interconnections  must  be  maintained 
on  a  chip  of  fixed  area  stay  roughly  constant.  Statistical  considerations  show  [13]  that  a 
good  approximation  to  the  maximum  length  Lmax  of  the  interconnection  required  is  given 
approximately  by 


L 


max 


A1/2/2 
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where  A  represents  the  area  of  the  chip.  If  the  area  of  the  chip  stays  roughly  constant, 
then  the  maximum  interconnect  length  stays  roughly  constant,  interconnection  resistance 
scales  up  as  4  ,  and  interconnection  capacitance  stays  constants  yielding  an  overall  scaling 
of  interconnect  delay  that  increases  in  proportion  to  c (  .  Note  that  if  chip  size  is 
increased,  rather  than  remaining  constant,  the  interconnection  problem  becomes  further  exa¬ 
cerbated.  As  a  consequence  of  these  considerations,  it  has  been  estimated  that  by  the  late 
1980's,  MOS  chip  speeds  will  be  limited  primarily  by  interconnect  delays  [11]. 

Another  different  aspect  of  scaling  is  of  considerable  importance  here.  We  refer  to 
the  effects  of  scaling  and  packing  on  the  numbers  of  connections  required  from  the  outside 
world  to  chips.  As  the  number  of  elements  in  a  single  chip  grows,  the  number  of  interconnec¬ 
tions  required  from  that  chip  to  other  chips  also  increases.  There  is  a  well-known  empiri¬ 
cal  relation,  known  as  Rentes  rule,  which  specifies  that  the  number  of  interconnections  M 
required  for  a  chip  consisting  of  N  devices  grows  as  approximately  the  0.61  power  of  N,  i.e. 

M  =  N0,61  (2) 


However,  the  perimeter  of  a  chip,  to  which  the  connections  must  be  made,  grows  as  only  the 
square  root  of  area,  or  equivalently  as  the  0.5  power  of  N.  The  disparity  caused  by  the 
difference  of  these  two  exponents  becomes  more  and  more  important  as  the  number  of  devices 
within  a  chip  grows  due  to  scaling  and  packing.  It  should  be  noted  that  Rent's  rule  applies 
only  to  chips  consisting  of  logic  elements.  Memory  cells  require  fewer  interconnections, 
in  addition,  it  is  required  that  each  chip  be  a  small  "random"  subset  of  the  entire  logic 
system  [13]. 

The  predictions  of  Rent's  rule  can  also  be  applied  to  collections  of  chips  on  a  board 
(providing  the  assumptions  mentioned  above  are  satisfied) .  At  the  chip  level,  some  of  the 
limitations  implied  by  Rent's  rule  can  be  overcome  by  the  use  of  metal  bump  technology  for 
making  interconnections  possible  from  the  interior  of  a  chip,  rather  than  just  from  the 
edges.  Optical  techniques  may  ultimately  provide  an  alternate  and  more  flexible  means  for 
providing  interconnections  directly  to  the  interior  of  a  chip. 

One  additional  and  final  implication  of  scaling  should  be  mentioned.  While  current 
scales  down  inversely  with  4,  the  cross-sectional  area  through  which  that  current  must  flow 
scales  down  inversely  with  4  .  The  net  result  is  that  current  density  Increases  in  propor¬ 
tion  to  4.  Such  an  increase  leads  to  increasing  effects  of  electromiqratlon.  by  which  is 
meant  the  movement  of  conductor  atoms  under  the  influence  of  electron bombardment.  The 
ultimate  effect  of  electromigration  is  the  breaking  of  conductor  lines  and  the  failure  of 
the  chip.  The  potential  of  optical  interconnections  as  a  means  for  alleviating  electromi¬ 
gration  problems  is  of  considerable  interest  here.  ...  . 

III.  The  Interconnect  Problem  -  Algorithmic  Motivation 

In  Section  II,  some  of  the  technological  motivation  for  considering  optical  intercon¬ 
nections  was  presented.  In  this  section  we  discuss  motivations  that  arise  from  algorithmic 
considerations.  We  describe  several  classes  of  problems  that  place  various  degrees  of  bur¬ 
den  on  interconnections.  Examples  are  drawn  primarily  from  the  fields  of  signal  processing 
and  image  processing.  ....  — 

The  first  class  of  operations  considered  places  the  smallest  burden  on  interconnec¬ 
tions.  We  refer  to  this  class  as  point  processing  operations.  Given  a  two  dimensional 
array  g(n,m)  of  data  points  to  be  processed- and  a  two  dimensional  array  of  processors  to 
perform  these  operations,  the  result  of  the  processing  is  described  by  _  .  .  .. 


g(n,m)  ■  NL[f(n,m)]  (3) 


where  NL[)  is  a  (generally  nonlinear)  function  that  transforms  each  value  f(n,m)  into  a  new 
value  g(n,m),  independent  of  the  values  of  f(n,m)  at  indices  other  than  (n,m).  Since  each 
g(n.m)  depends  only  on  one  corresponding  f(n,m),  no  communications  are  required  between  pro¬ 
cessors  in  the  array.  Because  the  interconnections  are  relatively  simple  for  such  problems, 
it  is  likely  that  the  only  role  for  optics  here  might  be  in  loading  and  unloading  the  data 
to  and  from  the  processor  array.  If  the  processor  array  is  a  large  one,  the  number  of  paral¬ 
lel  inputs  and  parallel  outputs  needed  for  maximum  efficiency  will  be  large,  implying  a 
large  pin  count  for  th°  processing  chip.  Relief  can  be  found  by  integrating  detectors  with 
each  processor,  thus  p  viding  a  parallel  set  of  optical  input  channels.  To  achieve  similar 
relief  for  output  data,  the  more  difficult  problem  of  integrating  a  source  with  each  proces¬ 
sor  must  be  solved.  Potential  paths  to  solutions  of  this  problem  are  discussed  in  later 
sections. 

As  a  second  level  of  interconnect  difficulties,  consider  the  problem  of  matrix-matrix 
multiplication.  Letting  capital  letters  A,B,  and  C  represent  matrices,  and  lower-case 
letters  with  subscripts  represent  particular  elements  of  those  matrices,  we  wish  to  consider 
the  computation  described  by 


C  -  A  B 


(4) 


or 
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Note  that  a  processor  computing  c,,  requires  communication  from  one  row  of  A  and  one  column 
of  B.  Thus  we  might  say  that  the1eommunications  required  are  semi-global  (fully  global  com¬ 
munications  would  require  that  each  c,.,  receive  data  from  all  elements  of  A  and  all  elements 
of  B,  which  is  not  the  case  here) .  Ifl3spite  of  the  fact  that  the  communication  requirements 
are  semi-global,  nonetheless  it  has  been  shown  [16)  that  the  computation  can  be  performed  by 
a  two  dimensional  array  of  processors,  with  each  processor  connected  only  to  its  nearest 
neighbors  (we  refer  to  such  processors  as  being  "mesh  connected") .  If  only  a  single  matrix 
product  is  to  be  performed,  then  the  time  required  by  the  mesh-connected  set  of  processors 
is  greater  than  would  be  the  case  if  the  processors  were  connected  with  semi-global  communi¬ 
cations.  However,  the  mesh-connected  array  can  readily  be  pipelined,  and  if  a  large  number 
of  matrix-matrix  oroducts  are  to  be  performed  in  succession,  the  pipelined  mesh-connected 
array  can  do  the  job  in  the  same  time  that  a  set  of  semi-globally  connected  processors  could 
achieve.  We  conclude  that  for  the  matrix-matrix  multiplication  problem,  the  roles  for 
optics  are  probably  limited  to  providing  data  loading  and  unloading,  and  providing  semi- 
global  communications  when  the  constraints  of  the  problem  prevent  pipelining. 

It  is  well  recognized  in  the  computer  science  field  that  there  are  classes  of  algo¬ 
rithms  that  are  intensive  in  their  requirements  for  interconnects  [17),  [18).  Such  algo¬ 
rithms  generally  require  periodic  exchange  of  data  between  processors  in  an  array,  with  the 
exchanges  being  most  efficiently  carried  out  with  other  than  nearest-neighbor  interconnec¬ 
tions.  Often  the  exchange  pattern  has  a  particular  structure  that  allows  the  exchanges  to 
be  carried  out  with  an  interconnect  network  that  is  not  fully  general.  Examples  of  useful 
exchange  networks  include  the  shuffle-exchange  network  (Tig.  2),  and  the  closely  related 
butterfly-exchange  network  (Fig.  3).  These  exchange  networks  play  central  roles  in  the  FFT 
algorithm  and  in  certain  approaches  to  sorting  problems.  Because  the  interconnections  are 
not  nearest-neighbor,  optical  interconnect  techniques  may  have  an  Important  role  to  play 
here.  Note  in  particular  that  the  butterfly  exchanges  could  be  carried  out  nicely  with  a 
dynamic  optical  interconnect  device  that  changes  its  Interconnect  pattern  at  the  end  of  each 
layer  of  exchanges. 

Lastly,  as  an  example  of  a  class  of  problems  with  even  greater  demands  on  intercon¬ 
nects,  we  mention  the  problem  of  space  variant  linear  filtering.  In  this  case  the  output 
array  g(n,m)  is  computed  from  the  input  array  f(n,m)  through  the  general  linear  filtering 
relation 


g(n,m)  -  25  h(n,m»k,p)  f(k,p)  (6) 
*P 

where  h(  )  is  the  kernel  of  the  operation.  In  the  most  general  case,  the  operation 
described  above  is  fully  global,  in  that  each  g(n,m)  receives  data  from  every  f(n,m).  The 
fact  that  the  transformation  is  space  invariant,  as  evidenced  by  the  dependence  of  the  ker¬ 
nel  on  joth  (n,m)  and  (k,p),  implies  that  the  interconnect  pattern  required  for  the 


computation  of  each  output  value  g(n,m)  changes  from  output  point  to  output  point.  Hence 
such  operations,  to  the  extent  that  they  are  fully  global  in  their  nature  and  to  the  extent 
that  they  require  a  multitude  of  different  interconnect  patterns,  place  an  extremely  inten¬ 
sive  burden  on  interconnect  technology,  and  may  ultimately  be  prime  targets  for  applications 
of  optical  interconnects. 

IV.  Optical  Interconnections 

This  Section  is  devoted  to  discussing  the  general  beneficial  properties  of  optics  as  a 
technology  for  providing  interconnections,  and  to  describing  some  of  the  approaches  that  can 
be  taken  to  using  optics  in  this  role.  Properties  of  optics  that  are  identified  as  being 
beneficial  are  done  so  with  more  conventional  integrated  circuit  interconnections  and  their 
limitations  in  mind. 

The  first  and  perhaps  most  important  advantage  of  optical  Interconnections  over  their 
electronic  counterparts  is  Immunity  to  mutual  interference  effects.  The  stray  capacitances 
that  exist  between  proximate  electrical  paths  introduce  cross-coupling  of  information  to  a 
degree  that  increases  with  the  bandwidth  of  the  signals  and  with  the  closeness  of  the  paths 
of  electrons.  In  contrast,  optical  interconnections  suffer  no  such  effects,  as  long  as  care 
is  exercised  to  assure  that  light  scattering  does  not  introduce  an  equivalent  result  (of 
different  origin).  Indeed,  two  streams  of  photons  can  pass  directly  through  one  another 
with  no  cross-coupling  of  their  high-frequency  modulations,  provided  the  propagation  medium 
is  linear,  as  will  nearly  always  be  the  case  unless  deliberate  steps  are  taken  to  introduce 
nonlinear  effects.  The  fundamental  differences  between  electrons  and  photons  in  this  regard 
can  probably  be  traced  to  the  fact  that  the  latter  are  bosons,  while  the  former  are  fer¬ 
mions.  Two  fermions  can  not  occupy  the  same  cell  of  phase  space,  whereas  two  bosons  can. 

A  second  advantage  of  optical  interconnections  is  their  freedom  from  capacitive  loading 
effects.  The  speed  of  propagation  of  an  electrical  signal  on  a  transmission'  line  depends  on 
the  capacitance  per  unit  length.  Thus  as  more  and  more  components  having  a  capacitive  com¬ 
ponent  of  admittance  are  attached  to  an  interconnection  line,  the  velocity  of  propagation 
decreases,  and  the  time  required  to  charge  the  line  to  a  predetermined  voltage  increases.  By 
way  of  contrast,  propagation  of  optical  signals,  whether  confined  to  waveguides  or  through 
free  space,  takes  place  at  a  speed  that  is  independent  of  the  number  of  components  that 
receive  those  signals,  namely  at  the  speed  of  light  in  the  medium  of  concern. 

A  third  potential  advantage  of  optical  interconnect  technology  lies  in  freedom  from 
planar  or  ouasi-planar  constraints .  Conventional  integrated  circuit  design  me tnodologies 
are  constrained  to  the  use  of  only  «  very  few  interconnecting  layers,  and  within  each  layer 
interconnections  can  not  cross.  Optical  waveguides  can  cross  through  other  optical 
waveguides  without  significant  cross-coupling  (provided  the  angle  of  intersection  is  greater 
than  about  10  degrees),  and  free-space  light  beams  can  cross  through  other  free-space  light 
beams  without  significant  interaction.  Such  flexibility  can  simplify  the  problems  associ¬ 
ated  with  routing  signals  on  a  complex  chip  or  board. 

A  fourth  advantage  to  be  considered  lies  in  the  potential  for  realization  of  repro¬ 
grammable  Interconnect  patterns  by  means  of  dynamic  optical  Interconnect  components.  In 
principle,  the  interconnection  patterns  associated  with  a  chip  can  be  changed  at  will  by 
writing  appropriate  information  to  the  interconnect  element.  While  such  a  capability  could 
be  realized  electronically  with  a  suitable  switching  network,  the  cost  in  terms  of  wiring 
area  is  likely  to  be  rather  large. 

Having  reviewed  the  basic  reasons  for  interest  in  optical  interconnect  technology,  we 
now  turn  attention  to  various  specific  forms  that  optical  interconnects  can  take.  In  this 
regard,  it  is  useful  to  distinguish  between  two  types  of  optical  interconnections.  The  term 
index-guided  interconnection  is  used  to  refer  to  an  optical  interconnection  established  when 
a  light  wave  is  guided  along  an  interconnect  path  by  means  of  a  structure  having  a  different 
refractive  index  than  the  surround.  The  usual  examples  are  optical  fibers  and  integrated- 
optic  waveguides.  The  term  free-space  Interconnection  is  used  to  refer  to  an  optical  inter¬ 
connection  established  when  a  light  wave  travels  from  source  to  detector  via  a  path  that 
consists  only  of  free  space  and  possibly  bulk  optical  components,  such  as  lenses,  mirrors, 
and  holographic  optical  elements.  Within  this  class  it  is  possible  to  distinguish  two  subc¬ 
lasses,  namely  unfocused  and  focused  interconnections.  For  unfocused  interconnects,  the 
optical  signals  are  simply  broadcast  over  a  comparatively  large  region,  whereas  for  focused 
interconnects,  the  signals  are  sent  to  specific  desired  locations  via  bulk  focusing  ele¬ 
ments. 

Index-guided  Interconnections  are  illustrated  in  Fig.  4,  which  shows  sources  connected 
to  detectors  via  both  fibers  and  integrated  optic  waveguides.  Consider  first  the  case  of 
interconnections  via  optical  fibers.  The  ends  of  the  fiber  must  be  carefully  positioned  over 
the  source  and  detector,  perhaps  with  the  help  of  a  wire-bonding  machine,  and  a  permanent 
attachment  muwt  be  made,  probably  with  CV-hardening  epoxy.  It  should  be  noted  that  fibers 
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can  not  be  allowed  to  bend  too  much,  for  bending  induces  radiation  losses,  which  can  be 
excessive  if  the  radius  of  curvature  of  the  bend  is  too  small.  Note  that  this  interconnect 
technology  requires  a  three-dimensional  volume  of  space,  for  fibers  must  be  crossed  above 
the  chip,  and  (for  surface  emitting  sources  as  shown)  bending  limitations  prevent  keeping 
the  fibers  too  close  to  the  plane  of  the  chip. 

For  integrated-optic  waveguide  interconnections,  the  waveguides  might  be  formed  by 
sputtering  glass  onto  a  silicon  dioxide  substrate.  These  guides  again  can  not  be  bent  too 
much,  due  to  the  problem  of  radiation  losses.  Light  can  be  coupled  out  of  a  guide  by  using 
a  taper  at  its  end,  causing  radiation  into  and  through  the  substrate.  Alternatively,  vari¬ 
ous  kinds  of  integrated-optic  junctions  and  couplers  can  be  used  to  route  signals  from  one 
source  to  a  multitude  of  detectors.  The  chief  difficulty  with  the  integrated-optic  approach 
lies  in  the  problems  of  efficient  coupling  into  and  out  of  the  waveguides. 

Figure  5  shows  two  approaches  to  free-space  unfocused  interconnections.  As  mentioned, 
this  approach  involves  broadcast  of  signals  from  one  or  more  sources  to  one  or  more  detec¬ 
tors,  with  no  focusing  of  the  light.  Figure  5a  shows  a  situation  appropriate  for  the 
transmission  of  a  single  data  stream  to  many  sites  on  the  same  chip,  as  required,  for  exam¬ 
ple,  for  clock  distribution.  Detectors  integrated  into  the  silicon  chip  receive  the  common 
signal  with  the  relative  delays  Inherent  in  the  path  length  differences  between  the  source 
and  the  detectors.  If  a  simple  positive  lens  is  inserted  between  the  source  and  the  chip 
such  that  the  source  lies  at  a  focal  point  of  the  lens,  then  all  paths  from  source  to  detec¬ 
tors  are  identical  in  length,  and  no  relative  delays  are  present.  Such  a  configuration 
might  be  useful  for  clock  distribution.  Figure  Sb  shows  a  reflective  configuration  that 
achieves  the  same  goal  as  the  previous  configuration  [19).  A  diffuser  situated  above  the 
chip  scatter:  any  optical  signal  transmitted  to  it  back  towards  the  chip,  where  it  can  be 
intercepted  by  any  number  of  detectors.  The  chief  disadvantages  of  all  unfocused  broadcast 
schemes  are  (1)  they  are  very  inefficient  in  that  only  a  small  fraction  of  the  total  optical 
power  falls  on  the  detector  sites  where  it  is  needed,  (2)  they  provide  basically  a  serial 
communication  channel,  with  any  parallelism  achievable  only  through  wavelength  multiplexing, 
a  complicating  factor  in  any  interconnect  realization,  and  (3)  they  require  a  three  dimen¬ 
sional  volume  rather  than  being  strictly  planar.  It  should  also  be  mentioned  that  the  broad¬ 
cast  of  optical  signals  to  the  entire  silicon  chip  has  the  potential  to  generate  signals  at 
unwanted  locations,  and  therefore  it  is  likely  that  an  opaque  dielectric  blocking  layer 
would  be  needed  to  prevent  the  optical  signals  from  reaching  undesired  locations. 

Figure  6  illustrates  three  approaches  to  free-space  focused  interconnects.  In  all  of 
these  cases,  the  light  is  sent  from  a  source  or  several  sources  only  to  those  locations 
where  detectors  are  located.  Figure  6a  is  appropriate,  for  example,  for  clock  distribution, 
in  which  a  single  signal  must  be  sent  to  many  sites  on  the  same  chip.  A  transmissive  holo¬ 
graphic  ootical  element  is  used  for  the  routing  of  the  signal.  Rather  high  overall  effi¬ 
ciencies  can  be  achieved  using  dichromated  gelatin  holographic  elements,  with  more  than  99« 
of  the  light  being  sent  to  the  desired  locations.  Note  that  any  element  that  focuses  the 
light  to  several  different  spots  will  also  Introduce  relative  time  delays  in  those  beams, 
but  for  typical  geometries  anticipated  (e.g.  chip,  source  and  hologram  confined  to  a  1  cm 
cube),  the  relative  delays  will  be  only  a  few  picoseconds.  Figure  6b  shows  a  more  general 
type  of  interconnection,  in  which  several  sources  on  the  left  are  connected  to  several 
detectors  on  the  right.  The  holographic  optical  element  used  in  transmission  connects  each 
source  to  any  number  of  detectors  in  whatever  pattern  might  be  desired.  Note  that  a  thin 
holographic  element  is  restricted  to  providing  the  same  interconnect  pattern  for  all  sources 
(i.e.  it  is  a  space-invariant  element).  However,  a  thick  holographic  element  can  provide 
different  interconnect  patterns  for  different  sources,  using  the  Bragg  effect  as  the  mechan¬ 
ism  of  selective  interconnection.  Finally,  Fig.  6c  shows  a  geometry  in  which  a  reflection 
hologram  is  used  as  the  routing  element.  The  free-spaced  focused  interconnections  described 
above  have  two  potential  disadvantages.  First,  and  most  serious,  they  require  very  precise 
alignment  of  the  focusing  element  with  respect  to  the  chip  or  chips.  Second,  as  with  the 
unfocused  free-space  interconnects,  since  the  focusing  element  lies  above  the  chip,  they 
require  a  three-dimensional  volume  in  order  to  establish  the  interconnect  pattern. 

He  will  close  this  section  with  a  consideration  of  the  Inefficiencies  Inherent  in  opti¬ 
cal  interconnect  schemes  by  virtue  of  the  fact  that  they  rely  on  the  conversion  of  electri¬ 
cal  signals  to  optical  signals  and  back  again  to  electrical  signals,  what  price  is  paid  for 
such  conversion?  An  answer  to  this  question  can  be  obtained  by  considering  a  "canonical" 
example.  It  is  reasonable  to  suppose  that  the  sources  used  are  LED's  or  lasing  diodes 
operating  in  the  near  infrared  with  a  wavelength  near  8000  A  where  the  responsivity  of  p-i-n 
silicon  detectors  is  near  its  peak  value  of  about  0.6  amps/watt.  It  is  not  unreasonable  to 
suppose  an  efficient  lasing  diode  using  1.5mw  of  drive  power  (1  ma  at  1.5  volts)  and  radiat¬ 
ing  0.75mw  of  output  optical  power.  If  optical  signals  of  O.Smw  power  are  available  at  the 
detector  end,  0.3ma  will  be  generated  by  a  p-i-n  photodiode.  A  transimpedance  amplifier  with 
perhaps  two  transistors  could  then  utilise  this  current  to  generate  voltages  in  the  lOOmv 
range.  Further  amplification  (a  few  transistors)  would  be  required  to  reach  voltages  compa¬ 
tible  with  logic  devices.  As  the  number  of  detectors  to  be  connected  to  a  single  source 
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increases,  the  power  available  at  each  detector  decreases,  thus  reouiring  more  amplification 
at  each  detector  site  (or  alternatively  more  transmitted  power  from  the  source). 


V.  The  Problem  of  Clock  Distribution 

A  problem  that  appears  amenable  to  immediate  attack  using  optical  technology  is  clock 
distribution  at  the  chip,  wafer,  or  board  level.  Most  (but  not  all)  computing  architectures 
require  synchronous  operation  of  a  multitude  of  devices,  circuits  and  subsystems.  Synchron¬ 
ism  is  maintained  by  distributing  to  all  parts  of  the  system  a  common  timing  signal,  called 
the  clock.  At  the  MOS  chip  level,  for  various  reasons  a  two-phase  clock  is  used.  However, 
if  a  single  phase  clock  is  distributed,  the  two  phases  can  be  generated  locally  with  a  rela¬ 
tively  simple  circuit  (20).  One  of  the  chief  difficulties  encountered  in  designing  circuits 
and  systems  for  high-speed  operation  is  "clock  skew",  a  term  which  refers  to  the  fact  that 
different  parts  of  the  circuit  or  system  receive  the  same  state  of  the  clock  at  different 
times.  In  this  section  we  consider  several  different  approaches  to  using  optics  for  distri¬ 
bution  of  the  clock,  with  the  aim  of  minimizing  or  eliminating  clock  skew.  Attention  is 
focused  primarily  on  clock  distribution  within  a  single  chip,  although  many  of  the  same  con¬ 
cepts  can  be  applied  at  the  wafer  level  or  even  the  board  level. 

The  interconnections  responsible  for  clock  distribution  are  characterized  by  the  fact 
that  they  must  convey  signals  to  all  parts  of  the  chip  and  to  many  devices.  These  require¬ 
ments  imply  long  interconnect  paths  and  high  capacitive  loading.  Hence  the  propagation 
delays  are  large  and  depend  on  the  particular  configuration  of  devices  on  the  chip.  Here  we 
consider  methods  for  using  optics  to  send  the  clock  to  many  parts  of  the  chip.  It  is  assumed 
that  optics  is  used  as  only  one  part  of  a  clock  distribution  hierarchy,  in  the  sense  that 
optical  signals  might  carry  the  clock  to  various  major  sites  on  the  chip,  from  which  the 
signals  would  be  distributed,  on  a  local  basis,  by  a  more  conventional  electronic  intercon¬ 
nection  system. 

If  optical  fibers  are  chosen  as  the  interconnect  technology,  then  the  following 
approach  might  be  used.  A  bundle  of  fibers  is  fused  together  at  one  end,  yielding  a  single 
core  into  which  light  from  the  modulated  source  (probably  a  diode  laser)  must  be  coupled. 
Light  coupled  in  at  the  fused  end  is  split  as  the  cores  separate,  and  is  transmitted  to  the 
ends  of  each  of  the  fibers  in  the  bundle.  Each  fiber  is  presumably  located  over  an 
integrated  optical  detector  which  will  convert  the  optical  signal  into  an  electrical  one. 
Alignment  of  the  fibers  and  the  detectors  can  be  accomplished  with  the  help  of  a  wire¬ 
bonding  machine,  and  uv-hardening  epoxy  can  be  used  to  fix  the  fiber  in  its  proper  place 
permanently.  Note  that  the  fibers  can  not  be  bent  too  much,  due  to  the  problem  of  radiation 
loss,  but  some  gradual  bends  will  be  inevitable  and  acceptable.  The  chief  difficulty  with 
this  approach  is  associated  with  the  alignment  problem. 

If  integrated  optical  waveguides  are  chosen  as  the  interconnect  technology,  then  the 
following  comments  apply.  Waveguides  might  be  formed  by  sputtering  glass  onto  a  silicon 
dioxide  substrate.  Probably  several  straight  waveguides  would  be  run  across  the  chip,  thus 
avoiding  problems  associated  with  bending  losses.  The  waveguides  would  all  be  fed  by  the 
same  optical  signal,  probably  generated  by  a  single  laser  diode,  with  distribution  to  the 
individual  waveguides  accomplished  by  fibers.  Alternatively,  separate  laser  diodes  could 
feed  each  waveguide,  with  synchronization  of  the  sources  provided  by  a  single  driving  signal 
distributed  electrically.  Light  must  be  coupled  out  of  each  waveguide  at  several  sites 
along  its  length,  with  a  detector  converting  the  optical  signal  to  electronic  form  at  each 
such  site.  The  primary  difficulties  associated  with  such  an  approach  are  the  problem  of 
efficient  coupling  of  the  light  into  the  waveguides,  and  the  problem  of  coupling  light  out 
with  small  coupling  structures.  Present  optical  waveguide  technology  requires  couplers  with 
sizes  rather  large  compared  with  the  feature  sizes  normally  used  in  electronic  integrated 
circuits. 

Clock  distribution  can  be  accomplished  with  free-space  unfocused  Interconnects  if  a 
single  optical  signal  is  simply  broadcast  to  the  entire  chip,  with  detectors  at  selected 
sites  generating  the  electrical  version  of  the  clock.  Most  desirable  in  terms  of  clock  skew 
would  be  a  single  optical  source  located  at  the  focal  point  of  a  positive  lens.  The  col¬ 
limated  light  falls  normally  on  the  surface  of  the  chip,  with  the  result  that  all  detectors 
receive  the  optical  clock  signal  with  no  skew  whatsoever.  However,  this  approach  is 
extremely  inefficient,  in  the  sense  that  most  of  the  light  falls  on  portions  of  the  chip 
where  it  is  not  needed  or  even  wanted,  and  hence  most  of  the  optical  energy  is  wasted.  This 
waste  has  important  implications  when  one  considers  that  the  amount  of  circuitry  required  on 
the  chip  to  bring  the  detected  clock  signals  up  to  a  usable  voltage  level  is  dependent  on 
the  strength  of  the  detected  signals,  which  in  turn  depends  on  the  amount  of  optical  power 
delivered  to  each  detector.  In  order  to  prevent  the  injection  of  unwanted  electrical  sig¬ 
nals  into  the  circuitry,  an  opaque  dielectric  overcoating  should  be  used  to  block  the  light 
at  all  points  except  those  where  a  detector  resides. 

Lastly  we  consider  clock  distribution  by  free-space  focused  interconnections.  For  such 
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interconnections,  the  optical  source  is  i ma?e4  by  an  optical  element  onto  a  irjltitud*  of 
detection  sites  simultaneously .  The  required  ootical  element  can  be  realised  by  rean:  of  a 
hologram,  which  acts  as  a  complex  grating  and  lens  to  senerate  focused  grating  components  at 
the  desired  locations.  The  optical  efficiency  of  the  ssner>e  can  far  exceed  that  of  the 
unfocused  method,  thereby  reducing  the  electrical  circuitry  required  to  bring  the  detected 
signals  up  to  logic-level  voltages.  Holographic  optical  elements  with  a  single  focal  point 
can  be  made  with  optical  efficiencies  well  in  excess  of  90t,  using  dichromated  gelatin  as  a 
recording  material.  For  elements  with  multiple  focal  points,  it  seems  certain  that  overall 
efficiencies  in  excess  of  SOt  can  be  achieved.  The  flexibility  of  tne  method  is  very  great, 
for  nearly  any  desired  configuration  of  connections  can  be  achieved.  The  chief  difficulty 
of  this  method  is  the  very  high  degree  of  alignment  precision  required  in  order  to  assure 
that  the  focused  spots  are  striking  the  appropriate  places  on  the  chip.  Of  course,  the 
spots  might  be  intentionally  defocused,  decrees!  the  efficiency  of  the  system  but  easing 
the  alignment  requirements.  There  exists  a  continuum  of  compromises  between  efficiency  and 
alignment  difficulty. 

Figure  7  illustrates  a  possible  configuration  that  retains  high  efficiency  but  minim¬ 
izes  alignment  problems.  The  imaging  operation  is  provided  by  a  two  element  microlens,  in 
the  form  of  a  block  with  a  gap  between  the  elements.  A  Fourier  hologram  can  be  inserted 
between  the  elements,  and  establishes  the  desired  pattern  of  focused  spots.  The  hologram 
itself  consists  of  a  series  of  simple  sinusoidal  gratings,  and  as  such  the  positions  of  the 
spots  generated  on  the  chip  are  invariant  under  simple  translations  of  the  hologram.  The 
source  is  permanently  fixed  to  the  top  of  the  upper  lens  block  after  it  has  been  aligned 
with  respect  to  an  alignment  detector  located  at  the  edge  of  the  chip,  thereby  establishing 
a  fixed  optical  axis.  The  only  alignment  required  for  the  hologram  is  rotation,  which  might 
be  aided  by  the  presence  of  a  few  more  alignment  detectors  on  the  chip.  The  positions  of 
the  image  spots  are  determined  by  the  spatial  frequencies  present  in  the  holographic  grat¬ 
ing.  Such  frequencies  could  be  established  very  precisely  if  the  hologram  were  written  by 
an  £-beam  lithography  machine. 

Clock  distribution  at  the  wafer  and  board  levels  is  also  a  strong  candidate  as  an 
application  of  optical  interconnections.  The  primary  differences  in  the  optical  require¬ 
ments  for  these  cases  lie  in  the  different  physical  sizes  of  chips,  wafers  and  boards.  At 
the  wafer  level,  the  discussions  of  the  previous  paragraphs  carry  over  essentially  without 
change.  At  the  board  level,  the  physical  areas  are  sufficiently  large  that  the  free-spaee 
and  integrated  optics  approaches  could  at  best  be  used  for  coverage  of  only  a  portion  of  an 
entire  board.  The  preferred  approach  seems  to  lie  with  optical  fibers  for  conducting  clock 
signals  to  remote  parts  of  a  single  board,  between  which  the  greatest  clock  skews  would  oth¬ 
erwise  be  anticipated.  Hierarchical  schemes  can  also  be  envisioned,  in  which  a  network  of 
fibers  distributes  an  optical  clock  to  a  series  of  widely  separated  sites,  from  which  either 
optical  waveguides  or  or  free  space  Interconnects  are  used  to  distribute  the  signals  more 
locally  to  detectors,  where  the  optical  signals  are  converted  to  electronic  form  and  distri¬ 
buted  on  an  even  more  local  scale  to  the  various  devices  that  require  the  clock  signal. 

VI.  Data  Interconnects  Oslnq  Optics 

The  use  of  optics  for  realising  data  paths  on  a  chip  is  inherently  a  more  difficult 
problem  than  the  clock  distribution  problem  discussed  above,  due  to  the  fact  that  there  are 
many  independent  sources  of  information,  each  requiring  an  optical  source.  In  addition,  the 
interconnection  patterns  required  by  different  sources  may  be  quite  different,  thus  requir¬ 
ing  an  extra  flexibility  of  the  interconnect  technology.  The  chief  practical  difficulties 
arise  from  the  fact  that  source  technology  is  GaAs  while  the  chip  technology  is  predom¬ 
inantly  Si.  He  consider  first  the  problems  of  Intrachip  communications,  and  later  make  some 
comments  on  interchip  communications. 

One  approach  to  the  problem  is  hybrid  in  nature.  GaAs  chips  containing  sources  are 
butted  against  a  Si  chip,  with  connections  made  by  wire  bonding.  Signals  to  be  sent  to  new 
locations  on  the  Si  chip  are  fed  to  the  perimeter  of  that  chip,  where  they  are  transferred 
to  the  GaAs  chips.  There  they  modulate  optical  sources,  which  send  optical  energy  back  to 
the  interior  of  the  81  chip,  where  it  is  detected  and  converted  to  electrical  signals.  Such 
a  scheme  is  illustrated  in  Fig.  8,  for  the  case  of  optical  signal  routing  via  a  holographic 
optical  element  above  the  chip.  A  similar  approach  could  use  sputtered  glass  waveguides  on 
an  SiO,  film  to  transport  the  optical  signals,  although  with  somewhat  less  flexibility  in 
terms  of  the  routing  patterns  that  can  be  achieved.  This  latter  scheme  might  be  well 
suited  to  the  realisation  of  a  perfect-shuffle  network. 

A  second  approach  to  the  problem  is  a  monolithic  one  that  presupposes  the  successful 
development  of  heteroepitaxial  growth  of  GaAs  on  Si,  presumably  with  the  help  of  a  buffer 
layer  (such  as  Ge)  for  lattice  matching.  The  hope  then  is  that  devices  can  be  fabricated  in 
both  the  81  and  the  GaAs.  Indeed  such  heteroepitaxial  growth  has  already  been  carried  out 
by  several  organizations.  However,  the  critical  question  concerns  the  defect  densities  In 
the  GaAs  and  the  corresponding  ability  to  realise  hlgh^uality  devices  in  that  material. 


So-ie  success  has  been  reported  in  this  resard,  w  th  defect  densities  ar  low  10*  per  cm2 
Ul) .  However,  as  yet  there  remains  a  considerable  distance  to  oo  before  devices  in  both 
materials  can  be  realised  in  monolithic  form. 

It  has  been  tacitly  assumed  in  the  preceding  discussions  that  the  ootical  sources  used 
must  be  on  or  close  to  the  Si  chip  containing  tne  computational  circuitry.  In  fact  the 
sources  could  be  quite  remote  from  the  circuitry,  provided  modulators  could  be  realised  on 
the  chip.  A  given  source  could  be  imaged  onto  the  on-chip  modulator,  where  the  modulating 
signal  would  be  imparted  to  the  light  beam.  At  present  the  only  serious  candidate  for  an 
on-chip  modulator  would  be  in  Integrated  optic  form  (22).  Such  devices  can  provide  modula¬ 
tions  in  the  Ghs  range,  but  the  difficulties  associated  with  coupling  the  externally  sup¬ 
plied  light  into  an  integrated  optic  waveguide  on  the  chip  will  be  non-trivial. 

If  holographic  optical  elements  are  to  be  used  for  data  routing,  the  necessity  to  pro¬ 
vide  different  geometrical  interconnect  patterns  for  different  light  beams  dictates  that  the 
elements  will  have  to  be  realised  in  the  form  of  thick  holograms.  For  such  structures,  the 
Bragg  effect  can  be  used  to  effectively  provide  a  separate  hologram  for  each  source,  and 
therefore  to  establish  a  different  and  unique  interconnect  pattern  for  each  data  path.  Such 
holograms  must  be  made  interferonetrically,  since  no  fully  synthetic  way  of  generating  the 
needed  thick  structures  is  known.  One  further  point  about  holographic  Interconnect  elements 
worth  mentioning  is  that  their  use  allows  the  possibility  of  "rewiring"  the  chip,  simply  by 
removing  one  holographic  optical  element  and  replacing  it  by  another  different  element. 
Ideally  one  would  like  to  have  an  electronically  addressable  light  valve  onto  which  a  holo¬ 
gram  can  be  written  for  the  particular  interconnect  pattern  desired  at  the  moment.  Such 
could  be  accomplished  today  if  a  thin  hologram  would  suffice,  l.e.  if  all  interconnect  pat¬ 
terns  had  the  same  geometrical  structure.  However,  how  to  construct  a  light  valve  for  real¬ 
izing  thick  holographic  optical  elements  is  problematical,  although  some  form  of  optical 
damage  in  electrooptic  crystals  might  provide  a  suitable  "real-time"  medium  for  writing 
thick  holograms. 

For  communication  at  the  interchip  level,  one  additional  approach  should  be  mentioned. 
Consider  an  array  of  silicon  chips  surrounding  a  GaAs  chip,  as  shown  in  Fig.  9.  Suppose 
that  information  is  to  be  transferred  from  this  cluster  of  chips  to  another  similar  cluster 
some  distance  away.  It  should  be  possible  to  realize  high-speed  multiplexers  and  demulti¬ 
plexers  in  the  GaAs  chips,  as  well  as  optical  sources,  very  wide  bandwidth  optical  signals 
can  then  be  transferred  between  chip  clusters  by  means  of  an  optical  fiber. 

VII.  Concluding  Remarks 

The  discussions  of  the  previous  sections  were  by  no  means  exhaustive,  for  the  filed  of 
optical  interconnections  is  so  young  that  we  have  only  scratched  the  surface  in  terms  of 
ideas.  However,  some  general  conclusions  are  possible,  and  we  list  some  in  what  follows. 

1.  The  problem  of  optical  clock  distribution  is  one  that  is  amenable  to  immediate  attack 
by  several  different  means.  It  provides  a  good  vehicle  for  learning  what  the  practical 
problems  will  be  when  optical  detectors  are  to  be  used  as  a  means  for  inputing  informa¬ 
tion  to  chips.  Furthermore,  the  problem  of  clock  skew  is  a  real  and  is^ortant  one,  and 
any  contributions  made  by  optics  to  solving  this  problem  will  be  welcome. 

2.  Hybridisation  of  Si  and  GaAs  chips  provides  the  short-term  solution  to  making  optical 
sources  available  to  silicon  circuitry.  Investment  in  the  development  of  such  tech¬ 
niques  will  lead  to  ability  to  experiment  with  optical  data  interconnections,  both  at 
the  interchip  and  lntraehlp  levels. 

3.  Heteroepltaxial  growth  of  structures  that  combine  Si  and  GaAs  layers  is  the  most 
attractive  long-term  approach  to  making  optical  sources  accessible  by  silicon  chips. 
The  development  of  such  techniques  to  the  point  where  Si  and  GaAs  devices  can  be  com¬ 
bined  in  monolithic  form  is  probably  a  necessary  condition  before  optical  interconnec¬ 
tions  can  have  major  impact  at  the  lntraehlp  level. 

4.  Integrated  optoelectronics  can  play  a  major  role  in  the  optical  interconnect  field. 
Developments  In  this  field  are  being  pushed  by  other  motivations,  but  optical  intercon¬ 
nections  can  take  direet  advantage  of  any  developments  in  this  field. 

5.  The  development  of  dynamic  masks  capable  of  changing  optical  interconnect  patterns  at 
high  speeds  would  provide  a  unique  capability,  and  could  potentially  speed  up  many 
Important  signal  processing  operations  that  depend  on  the  fast  Fourier  transform  algo¬ 
rithm,  for  example.  To  be  maximally  effective,  such  masks  should  be  analogous  to  thick 
holograms,  in  that  many  different  interconnect  patterns  can  be  multiplexed,  one  for 
each  of  many  sources. 


6. 


The  perfect-shuffle  exchange  is  a  sufficiently  ubiquitous  operation,  and  requires  suf¬ 
ficiently  large  wiring  area  in  conventional  integrated  circuit  form,  that  optical  real¬ 
izations  are  worth  pursuing.  Realizations  could  be  by  means  of  fibers,  waveguides,  or 
holographic  optical  elements. 

It  is  hoped  that  the  ideas  presented  above  will  serve  to  stimulate  further  thought  and 
research  in  this  interesting  area  of  technology. 
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Flqure  1.  Scaling  of  interconnect  lines. 


Figure  2.  Shuffle-exchange  network. 
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Coherent  optical  techniques  for  diagonaiization  and  inversion 
of  circulant  matrices  and  circulant  approximations  to 
Toeplitz  matrices 


Qizhi  Cao  and  Joseph  W.  Goodman 


A  coherent  optical  system  for  performing  continuous  Fourier  transforms  can  be  modified  to  perform  discrete 
Fourier  transforms.  Such  a  system  is  capable  of  diagonalizing  circulant  matrices  presented  at  its  input. 
The  diagonal  elements  of  the  new  matrix  are  the  eigenvalues  of  the  original  matrix.  A  suitable  modification 
allows  the  eigenvalues  of  many  different  circulant  matrices  to  be  found  simultaneously.  Such  a  technique 
can  be  used  for  the  initial  portion  of  a  coherent  optical  matrix  inversion  system,  which  ran  find  the  inverses 
of  circulant  matrices.  The  method  can  also  be  applied  to  the  problem  of  inverting  Toeplitz  matrices  in  a  hy¬ 
brid  digital  and  optical  system. 


I.  Introduction 

Matrix  operations  are  receiving  much  attention  from 
those  working  in  the  field  of  optical  information  pro¬ 
cessing.  Primary  attention  has  been  focused  on  two 
problems,  namely,  matrix-vector  multiplication  and  the 
solution  of  sets  of  simultaneous  linear  equations  by 
means  of  iterative  techniques.  In  this  paper  we  discuss 
potential  optical  solutions  to  another  class  of  problems, 
specifically  the  diagonaiization  and  inversion  of  circu¬ 
lant  matrices  and  circulant  approximations  to  Toeplitz 
matrices.  The  basic  ideas  underlying  the  method  were 
proposed  some  time  ago  but  have  just  recently  appeared 
in  print.’  Here  we  briefly  review  the  basic  ideas  in¬ 
volved  and  focus  attention  on  extensions  of  the  method 
to  matrices  with  complex  elements,  on  some  simple 
experiments  verifying  the  most  fundamental  aspects  of 
the  ideas,  and  on  the  implications  of  the  method  for  the 
problem  of  inverting  Toeplitz  matrices. 

A  circulant  matrix  is  one  for  which  each  successive 
row  is  a  simple  circular  shift  of  the  row  above  by  a  single 
element  For  example,  in  matrix  C  below,  the  numbers 
1-4  stand  for  four  distinct  elements,  and  the  organiza¬ 
tion  of  those  elements  in  the  circulant  matrix  is  as  fol¬ 
lows: 
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A  Toeplitz  matrix  is  one  for  which  all  elements  along 
the  diagonal  or  any  subdiagonal  are  constant.  For  ex¬ 
ample,  again  letting  each  number  represent  a  distinct 
matrix  element,  a  Toeplitz  matrix  has  the  form 

“l  2  3  4“1 

5  12  3 

6  5  12 
_7  6  5  lJ 

If  only  M  of  the  2N  -  1  possible  diagonals  are  nonzero, 
the  matrix  is  called  a  banded  Toeplitz  matrix  with 
bandwidth  M . 

While  any  circulant  matrix  is  also  Toeplitz,  the  re¬ 
verse  is  not  true.  Any  covariance  matrix  of  a  wide-sense 
stationary  random  process  is  a  Toeplitz  matrix  but  in 
addition  has  Hermitian  symmetry,  implying  that  its 
eigenvalues  are  non-negative  and  real.  The  eigenvalues 
of  a  circulant  matrix  are  in  general  complex-valued. 

A  very  important  problem  in  much  of  signal  pro¬ 
cessing  is  the  inversion  of  Toeplitz  covariance  matrices. 
Such  inversions  are  required  for  the  realization  of  op¬ 
timum  least-mean-square-error  filters,  and  the  inver¬ 
sion  process  must  be  repeated  as  knowledge  of  the  co- 
variance  matrices  is  updated.  While  fast  algorithms  for 
such  inversions  have  been  devised,  the  issue  of  com¬ 
putational  speed  is  still  an  important  one,  and  any  help 
offered  by  optical  processing  would  be  welcome.  In¬ 
version  of  Toeplitz  matrices  is,  therefore,  the  ultimate 
motivation  for  this  work. 
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In  Sec.  II  we  discus>  the  problem  of  diagonalizing 
circulant  matrices  using  a  coherent  optical  system  and 
oiler  some  simple  experimental  results  verifying  the 
basic  idea.  Section  111  discusses  potential  methods  for 
inverting  circulant  matrices  based  on  the  diagonaliza- 
tion  method  of  Sec.  II.  Such  inversion  techniques  have 
not  yet  Iteen  experimentally  verified,  but  there  is  strong 
evidence  that  they  can  be  carried  out  in  practice. 
Section  IV  provides  a  method  to  discover  both  moduli 
and  phases  of  the  output  spots  based  on  the  intensity 
readings  from  a  square-law  detector.  Section  V  dis¬ 
cusses  the  application  of  the  proposed  technique  to  the 
inversion  of  a  Toeplitz  matrix  or  a  multitude  of  Toeplitz 
matrices  simultaneously.  A  combined  optical  and 
digital  hybrid  processor  is  required.  Finally,  Sec.  VI 
summarizes  the  conclusions  of  the  work  at  this  stage. 

II.  Diagonalization  of  Circulant  Matrices  Using 
Coherent  Optics 

A  remarkable  property  of  circulant  matrices  (not 
shared  by  Toeplitz  matrices)  is  that  they  are  diagonal¬ 
ized  by  the  discrete  Fourier  transform  (DFT).2  The 
resulting  diagonal  elements  are  the  complex  eigenvalues 
of  the  original  matrix.  Thus,  if  the  DFT  matrix  VV  is 
defined  bv  (again  illustrating  with  a  4  X  4  example) 
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where  w  =  exp(-t'27r/Ar),  we  have 
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cr 


where  the  various  A  are  the  eigenvalues  of  C. 


A.  Implementation  with  a  2-D  Matrix  Input 

Two  methods  for  diagonalizing  a  circulant  matrix 
using  coherent  optics  can  be  identified.  One  is  esthe- 
tically  pleasing  in  the  sense  that  the  entire  circulant 
matrix  appears  at  the  input  to  the  system  and  the  entire 
diagonalized  matrix  appears  at  the  output  of  the  system. 
The  second  method  is  potentially  more  powerful.  Only 
a  single  row  of  the  circulant  matrix  is  presented  at  the 
input,  and  all  eigenvalues  appear  in  a  single  row  of  the 
output  plane.  With  the  help  of  suitable  astigmatic 
optics,  the  second  method  can  be  used  to  find  the  ei¬ 
genvalues  of  many  different  circulant  matrices  in  par¬ 
allel,  while  the  first  method  operates  on  one  matrix  at 
a  time.  We  focus  our  attention  first  on  the  2-D  method 
and  later  discuss  the  1  -D  method.  Throughout  this  and 
the  next  sections  we  assume  that  the  elements  of  the 
circulant  matrices  of  interest  are  non-negative  and  real. 
Generalization  to  complex -valued  matrices  is  deferred 
to  a  later  section. 
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The  task  of  diagonalizing  a  circulant  matrix  using 
coherent  optics  can  be  performed  if  the  normal  con¬ 
tinuous  2-D  Fourier  transform,  so  easily  performed  by 
such  systems,  can  be  changed  to  a  discrete  Fourier 
transform.  Such  a  change  can  indeed  be  made.  With 
reference  to  Fig.  1,  the  matrix  to  be  diagonalized  is  en¬ 
tered  into  the  coherent  optical  system  as  an  array  of 
transmitting  cells  in  a  mask,  each  cell  representing  one 
element  of  the  circulant  matrix  (for  the  moment  we 
assume  that  such  elements  are  non-negative  and  real). 
The  matrix  is  repeated  at  least  3X3  times  in  the  hori¬ 
zontal  and  vertical  directions  causing  the  spectrum  to 
form  a  series  of  discrete  spots.  The  amplitude  of  each 
of  the  spots  represents  a  different  complex  eigenvalue 
of  the  original  matrix.  Measurement  of  the  intensities 
of  these  spots  by  discrete  elements  of  a  detector  array 
is  equivalent  to  measurement  of  the  squared  magni¬ 
tudes  of  the  eigenvalues  of  the  matrix.  If  the  full 
complex  values  are  desired,  interferometry*  or  hetero¬ 
dyne  detection  must  be  used  to  extract  both  amplitude 
and  phase  information. 

A  mathematical  description  of  the  process  can  be 
given  as  follows.  The  coherent  optical  processor  is  as¬ 
sumed  to  perform  a  Fourier  transform  of  the  field  g(x^y) 
at  its  input.  Thus, 

C UjA>  I  “  jru,y*  exp|-i2rit>x  +  lyy  l)d*dv.  (41 

The  input  transparency  representing  the  replicated 
matrix  can  be  represented  as  an  amplitude  transmit¬ 
tance: 
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where  X  is  the  interval  between  any  two  cells  along  both 
x  and  v  directions  in  the  transparency,  L  is  the  period 
of  replication,  N  is  the  number  of  elements  in  one  row 
or  column  of  the  matrix,  and  are  the  amplitude 
transmittance  of  matrix  elements.  In  Eq.  (3).  for  sim¬ 
plicity  we  have  neglected  the  finite  size  of  the  trans¬ 
parency  and  the  finite  size  of  the  cells  representing 
mat  rix  elements.  A  more  general  formula  will  be  given 
later. 

Substitution  of g(x,v)  into  Eq.  (4)  yields  a  continuous 
spectrum 
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Fig.  2.  Matrix  and  its  eigenvalues — 2-D  format:  (a)  replicated  4 
X  4  circulant  input  matrix;  (b)  output  for  the  matrix  of  (a).  The  X, 
are  the  eigenvalues  of  the  original  matrix. 


Fig.  3.  Experimentally  obtained  output  for  a  3  X  3  circulant  input 
matrix. 


Finally,  suppose  that  the  continuous  spectrum  is 
sampled  at  particular  points  vz  =  k/L,  vy  =  UL.  Then, 
with  the  definition  N  -  L/X,  the  observed  amplitudes 
can  be  expressed  as 


1  n-in-i  r  or  1 

G*/»  —  E  E  *p«exp  -7  —  (kp  +  lq)\. 

L*  pm 0  qm 0  \  N  J 

The  values  of  the  complex  spectrum  at  the  chosen 
points  are  seen  to  represent  the  DFT  coefficients  of  the 
original  matrix  (to  within  a  known  constant). 

When  the  size  a  of  a  cell  of  the  input  matrix  and  the 
size  w  of  the  whole  matrix  transparency  are  taken  into 
account, g(xy)  and  G(vx,vy)  become 
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Returning  to  the  optics  of  the  situation,  if  Fig.  2(a) 
represents  the  form  of  the  matrix  mask  at  the  input  to 
the  system,  Fig.  2(b)  represents  the  form  of  the  light 
amplitude  distribution  in  the  rear  focal  plane  of  the  lens 
of  Fig.  1.  Each  spot  of  light  in  the  focal  plane  is  repre¬ 
sented  by  a  small  square,  with  a  label  indicating  its 
complex  value.  The  complex  amplitudes  of  the  spots 
are  proportional  to  the  eigenvalues  of  the  original  cir- 
cuiant  matrix.  In  addition,  as  implied  by  Eqs.  (6)  and 
(8),  there  is  a  replication  of  these  spots  in  the  focal  plane. 
The  spots  also  are  attenuated  by  a  slowly  falling  enve¬ 
lope  arising  from  the  finite  size  of  the  cells  of  the  input 
matrix. 

There  is  one  subtlety  that  should  be  mentioned.  By 
using  the  2-D  forward  DFT,  we  have  in  effect  performed 
the  matrix  operation  WCW,  omitting  the  inverse  sign 
on  the  first  DFT  matrix  [compare  Eq.  (3)].  The  result 
is  only  a  90°  clockwise  rotation  of  the  diagonals  in  the 
eigenvalue  plane,  a  change  that  is  of  no  consequence 
provided  its  existence  is  recognized.  As  shown  in  Fig. 
2(b),  for  example,  the  resulting  eigenvalues  are  along  the 
diagonal  lines  in  the  second  and  fourth  quadrants  in¬ 
stead  of  in  the  first  and  third  quadrants. 

A  simple  experiment  has  been  performed  to  verify  the 
basic  ideas  presented  above.  A  circulant  matrix  of  the 
form 


was  used.  Figure  3  shows  the  distribution  of  light  in¬ 
tensity  present  in  the  focal  plane  of  the  lens,  each  spot 
corresponding  to  an  eigenvalue.  The  intensities  of 
these  spots  were  measured.  The  measured  numbers 
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Table  L  Measured  Output  Relative  Value* 


Ideal 

Measured  by  power 
meter  ( n  W) 

Values  of  sine 
envelope 

After  compensation  for 
sine  envelope 

Normalized 

IX, 

2 

0.64 

0.90 

0.71 

2.03 

ns 

1 

0.33 

0.94 

0.35 

1 

m 

1 

0.33 

0.94 

0.35 

1 

were  compensated  for  the  falloff  induced  by  the  finite 
size  of  the  input  matrix  elements,  and  square  roots  were 
taken,  yielding  estimates  of  the  magnitudes  of  the  ei¬ 
genvalues  of  the  matrix.  Table  I  compares  the  numbers 
deduced  from  measurement  with  the  exact  results  ob¬ 
tained  by  analytical  calculation.  For  this  simple  case 
the  accuracy  was  found  to  be  ~1.5%.  It  is  likely  that  the 
accuracy  obtainable  is  a  function  of  the  size  of  the  ma¬ 
trix,  and  greater  care  will  be  needed  to  obtain  high  ac¬ 
curacy  from  larger  matrices. 

B.  Implementation  with  a  1-0  Input 

Since  the  successive  rows  of  a  circulant  matrix  are 
circularly  shifted  versions  of  the  neighboring  rows,  such 
a  matrix  is  obviously  highly  redundant.  It  may  come 
as  no  surprise,  therefore,  to  learn  that  the  eigenvalues 
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Fif.  4.  Matrices  and  their  eigenvalue* — 1-D  format,  (a)  Optical 
system  for  finding  eigenvalue*  of  many  circulant  metric**.  Each  row 
of  the  output  matrix  contains  the  eigenvalue*  of  one  input  matrix,  fb) 
Experimentally  obtained  output  for  an  input  mask  having  many 
three-element  sequences  in  parallel. 


of  the  entire  circulant  matrix  can  be  obtained  from 
operations  on  a  single  row.  It  can  be  shown  that,  if  the 
top  row  of  the  matrix  in  Eq.  (1)  is  subjected  to  a  1-D 
DFT,  the  complex  DFT  elements  are,  in  fact,  the  ei¬ 
genvalues  of  the  original  circulant  matrix  (see  Ap¬ 
pendix). 

The  above  fact  suggests  that  only  one  row  of  the  cir¬ 
culant  matrix  need  be  entered  into  the  optical  system 
(with  some  replication  of  that  row  in  the  direction  of  its 
length)  and  only  a  1-D  Fourier  transform  need  be  per¬ 
formed.  Thus,  the  spherical  lens  of  Fig.  1  can  be  re¬ 
placed  by  a  cylindrical  lens  with  power  along  the  di¬ 
rection  of  the  row.  More  important,  by  use  of  a  suitable 
pair  of  spherical  and  cylindrical  lenses,  the  optical 
system  can  be  made  to  Fourier  transform  in  one  direc¬ 
tion  (along  the  rows)  and  image  in  the  other  direction 
(across  the  rows),  allowing  1-D  DFTs  to  be  performed 
independently  on  all  rows  simultaneously.  In  this 
fashion  the  eigenvalues  of  many  circulant  matrices  can 
be  found  in  parallel.  A  2-D  detector  array  is  then  re¬ 
quired  at  the  output  of  the  optical  system  if  the  squared 
magnitudes  of  all  eigenvalues  are  to  be  measured.  The 
optical  Bystem  is  illustrated  in  Fig.  4(a). 

The  idea  of  finding  the  eigenvalues  of  many  circulant 
matrices  in  parallel  as  described  above  was  tested  (in 
a  limited  way)  using  the  optical  system  of  Fig.  4(a)  and 
the  same  input  mask  that  led  to  the  results  of  Fig.  3. 
Originally  the  input  mask  was  regarded  as  representing 
a  full  circulant  matrix.  For  this  particular  mask,  each 
row  is  a  unit  circular  shift  of  the  row  above  it,  so  all  cir¬ 
culant  matrices  being  represented  by  the  rows  of  the 
mask  are  circularly  shifted  versions  of  the  same  matrix. 
The  magnitudes  of  the  eigenvalues  of  a  circulant  matrix 
are  unaffected  by  circular  shifts  of  the  rows  of  the  ma¬ 
trix.  Therefore,  all  rows  in  the  distribution  of  output 
intensity  in  the  DFT  plane  should  be  identical  for  this 
particular  mask.  Figure  4(b),  showing  the  experimental 
result,  demonstrates  that  this  is  the  case.  With  a  more 
general  set  of  circulant  matrices  represented  by  differ¬ 
ent  rows  of  the  input  mask,  the  various  rows  in  the 
output  plane  would  differ  from  one  another. 

C.  Diagonalizing  Circulant  Matrices  with  Complex- 
Valued  Elements 

Our  previous  discussions  have  all  assumed  implicitly 
that  the  elements  of  the  circulant  matrix  or  matrices  to 
be  diagonalized  are  non-negative  and  real;  that  is,  it  has 
been  assumed  that  they  can  be  represented  physically 
by  a  non-negative  and  real  amplitude  transmittance  of 
a  mask.  In  practice,  it  is  important  to  be  able  to  deal 
with  bipolar  real  inputs  and  in  some  cases  complex¬ 
valued  inputs.  In  this  section  we  discuss  several  ways 
for  generalizing  the  nature  of  the  inputs. 
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First,  we  suggest  a  way  to  input  a  2-D  complex  matrix 
having  positive  real  and  imaginary  parts  for  a  2-D  op¬ 
tical  Fourier  transform  system.  From  Fig.  2(b)  we  see 
that  the  output  spots  of  a  circulant  matrix  mask  are  all 
arranged  along  the  diagonals.  If  we  shift  the  entire 
input  along  the  direction  perpendicular  to  those  diag¬ 
onals,  according  to  the  shift  theorem,  we  will  have  an 
output  which  is  the  same  as  the  original  one,  except  for 
a  linear  phase  factor  along  the  shift  direction;  that  is, 
along  each  diagonal  of  the  output,  the  complex  field  of 
each  spot  on  this  diagonal  has  an  additional  constant 
phase  factor.  Controlling  the  shift  by  various  amounts, 
we  can  obtain  various  phase  factors  at  will. 

For  example,  as  shown  in  Fig.  5(a),  suppose  we  shift 
the  matrix  input  along  the  diagonal  toward  the  lower 
left  with  a  displacement  of  lU  of  the  diagonal  interval 
between  two  adjacent  cells  of  the  input.  The  Fourier 
transform  of  this  shifted  matrix  [the  shaded  one  in  Fig. 
5(a)]  has  an  additional  phase  factor  0  *  2 r(vx  +  vy)  • 
(X/4).  In  the  output  plane  [see  Fig.  5(b)]  for  the  di¬ 
agonals  which  satisfy 
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Fif.  5.  Two-dimensional  encoding  of  a  matrix  with  complex  ele¬ 
ments.  (a)  An  input  mask.  The  original  function  kg(xy) m  /f(xo') 
+  The  input  function  is  t'(xj)  m  fl(xj)  +  I\x  +  [X/4)j 

+  (X/4)].  (b)  Output  for  the  input  of  (a).  If  (u,  +  vy)  ■  (4n  +  1)/X 
with  n  •  0,±1,±2, ....  GHv,#y)  •  R(v,#y)  +  jl(v,juy). 
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Fig.  6.  Two-dimensional  encoding  of  a  circulant  matrix  with  bipolar 
and  complex  elements.  The  four  squares  frame  four  matrices,  each 
with  relative  shift  of  X/4  along  the  diagonal. 


<r,  +  vy)  *  ...  - 
with  n  ■  0,±l,dfc2, . . . 


7  3  5  4n  +  1 

x’x’x’x .  X 

,  the  phase  factors  are 


-7t  -3t  x  5t  4n  -f  1 

2  ’  2  ’i’  2 .  2 

with  n  ■  0,±1,±2, -  Therefore,  by  making  a  mask 

with  twin  structures  as  shown  in  Fig.  5(a),  with  the 
shaded  one  representing  the  imaginary  part  and  the 
unshaded  the  real  part,  we  generate  the  effect  of  a 
complex  input  having  positive  real  and  imaginary 
parts. 

In  the  same  way,  the  effect  of  a  complex  matrix  mask 
with  negative  real  and  imaginary  parts  can  be  gener¬ 
ated. 

Suppose  the  original  matrix  is  A  -  R  +  jl,  and  both 
R  and  I  have  bipolar  components.  They  can  be  de¬ 
composed  as  follows:  R  =  Cr  -  Br  and  /  ■  C,  -  B,  , 
where  both  B,  and  B,  are  bias  matrices  with  positive 
constant  elements  equal  to  the  magnitude  of  the  most 
negative  elements  from  R  and  I,  respectively.  Then  Cr 
and  C,  have  only  real  and  positive  elements. 

Noticing  that  a  negative  sign  is  equivalent  to  a  t 
phase  factor,  we  can  design  an  input  mask  for  the  matrix 
A  as  shown  in  Fig.  6.  As  before,  the  desired  eigenvalues 
are  located  along  the  output  diagonals  as  described 
earlier. 

For  the  case  of  many  1-D  inputs  and  a  spherical- 
cylindrical  lens  system,  we  now  suggest  a  way  to  gen¬ 
erate  a  bipolar  real-valued  input  Assume  that  the 
three  independent  (real  and  bipolar)  components  of  a 
3X3  circulant  matrix  are  |ci,C2,c3(.  By  addition  of  this 
vector  to  a  bias  vector  | b,b,b)>  we  have  a  new  vector 
|c  i»c  2,c  3}  with  all  positive  elements.  Therefore,  the  1-D 
Fourier  transform  for  |ci,C2,C3)  can  be  written  as 


and  correspondingly 
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Now,  as  in  the  2-D  case,  the  original  sequence  can  be 
represented  by  two  sequences,  one  of  which  is  shifted 
from  the  other  appropriately.  Knowing  that  the  Fou¬ 
rier  transform  of  the  bias  sequence  has  only  one  nonzero 
element,  we  can,  for  example,  shift  it  from  the  other  with 
a  distance  of  X/2  [Fig.  7(a)]  so  that  a  ir  phase  factor 
appears  at  the  first  spot  of  the  second  period  of  its 
Fourier  transform,  and  so  the  sum  of  the  two  Fourier 
transforms  within  this  period  turns  out  to  be  the  Fourier 
transform  of  the  original  bipolar  sequence.  This  ap¬ 
proach  is  illustrated  in  the  diagrams  of  Fig.  7. 

Furthermore,  in  Sec.  Ill  it  will  be  seen  that  a  second 
DFT  is  required  to  perform  inversion  of  a  circulant 
matrix.  Therefore,  we  should  have  at  least  three  con¬ 
secutive  periods  of  correct  eigenvalues  at  this  step  to 
assume  that,  in  the  final  output  plane,  the  elements  of 
the  inverse  matrix  appear  as  separated  spots.  Another 
input  geometry  is  designed  for  this  purpose  and  is  il¬ 
lustrated  in  Fig.  8. 


III.  Inversion  of  Circulant  Matrices 

While  the  problem  of  finding  eigenvalues  of  a  circu¬ 
lant  matrix  is  of  interest  in  its  own  right,  there  is  much 
greater  interest  in  inverting  such  matrices.  Such  a 
matrix  inversion  can  be  carried  out  if  a  means  of  in¬ 
verting  the  complex  eigenvalues  of  the  diagonalized 
form  of  the  matrix  can  be  found.  A  discrete  Fourier 
transform  followed  by  inversion  of  the  complex  eigen¬ 
values  followed  by  an  inverse  DFT  yields  a  new  circu¬ 
lant  matrix  that  is  the  inverse  of  the  original  matrix. 
Such  an  inversion  method  can  be  applied  to  a  single 
circulant  matrix  or  to  many  circulant  matrices  simul¬ 
taneously  using  the  astigmatic  processor  described 
earlier. 

The  key  question  to  be  addressed  here  is  how  to  invert 
the  complex  eigenvalues  of  the  matrix  or  matrices,  those 
eigenvalues  being  represented  by  complex-valued  fields 
in  the  DFT  plane  of  the  processor.  An  answer  to  this 
question  is  suggested  by  the  following  discussion. 

Suppose  we  have  a  holographic  recording  device  in 
the  DFT  plane  that  produces  a  diffracted  field  (into  the 
-1  or  conjugate  diffraction  order)  having  an  amplitude 
that  is  proportional  to  the  contrast  of  the  fringes  gen¬ 
erated  between  an  incident  uniform  plane  reference 
beam  and  the  object  beam  consisting  of  discrete  spots 
of  light  with  amplitudes  proportional  to  the  eigenvalues 
of  the  input  circulant  matrix.  If  Ur  and  U0  represent 
the  amplitudes  of  the  incident  reference  and  object 
beams,  respectively,  we  suppose  that  the  amplitude  l/,- 
of  the  reconstructed  image  wave  is  given  by 
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Fig.  7.  One-dimensional  encoding  of  a  sequence  with  bipolar  num¬ 
bers,  resulting  in  one  correct  period  in  the  output:  (a)  \a\M2Ji3\  is  a 
positive  sequence.  I6.fc.bl  is  a  bias  sequence;  (b)  I  -D  DFT  of  la,.a2.aj|; 
(c)  1-D  DFT  of  Ifc.fc.fcl;  (d)  1-D  DFTof  |o,^2^3|. 
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Fig.  8.  One-dimensional  encoding  of  a  sequence  with  bipolar  num¬ 
bers  resulting  in  three  correct  periods  of  the  output:  (a)  positive  se¬ 
quence  lo'itHstOsI  and  three  bias  sequences,  each  of  which  has  linear 
shifts  of  XU  or  3X/4  from  the  positive  sequence,  respectively,  (b)  1-D 
DFT,  of  four  sequences;  (c)  1-D  DFT  of  lai^ij^st 
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where  a  is  a  constant  If  we  now  further  assume  that 
the  reference  beam  is  chosen  to  be  much  weaker  than 
the  object  beam,  we  find  that  the  reconstructed  image 
wave  is  given  approximately  by 

l/, -«  77-  (10) 

Since  the  reference  wave  amplitude  is  constant  the 


reconstructed  image  wave  has  an  amplitude  propor¬ 
tional  to  the  reciprocal  of  the  complex  object  wave  U„. 
This  is  precisely  the  inversion  process  we  desired. 

That  photographic  film  can  be  made  to  behave  in  the 
manner  postulated  in  Eq.  (9)  is  well  established  in  optics 
literature.  Such  a  dependence  was  exploited  by  Rag- 
narsorv*  and  by  Tichenor4  to  realize  coherent  optical 
inverse  filters  using  bleached  photographic  emulsions. 
More  important,  a  similar  dependence  of  the  field  dif¬ 
fracted  in  four-wave  mixing  experiments  has  been  noted 
and  exploited  by  White  and  Yariv  for  image  edge  en¬ 
hancement.5  Most  recently,  Ja  has  demonstrated 
wave-front  division  using  four-wave  mixing.6  Thus, 
there  seems  to  be  ample  reason  to  believe  that  methods 
for  optical  inversion  of  complex  eigenvalues  can  be  re¬ 
alized  in  the  near  term,  particularly  using  real-time 
four-wave  mixing  devices. 

The  optical  setup  required  for  finding  inverses  of 
many  circulant  matrices  simultaneously  is  illustrated 
in  Fig.  9  under  the  assumption  that  a  suitable  nonlinear 
device  is  used  for  the  reciprocation  operation.  While 
the  reciprocation  and  inversion  processes  have  not  been 
experimentally  demonstrated  in  this  paper,  the  evi¬ 
dence  that  they  can  be  carried  out  at  least  for  some 
dynamic  range  of  eigenvalues  is  extremely  strong  based 
on  the  above  references. 
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Fig.  9.  System  for  performing  the  inversion  of  many  circulant 
matrices. 

In  the  following  we  will  demonstrate  this  property  for 
a  3  X  3  circulant  matrix.  The  proof  for  an  N  X  N  cir- 
culant  matrix  is  a  simple  extension  of  that  for  the  3  X 
3  case.  With  ci,  C2,  and  c3  as  three  independent  com¬ 
ponents,  the  circulant  matrix  could  be  represented  by 
vector  C, 


consisting  of  the  elements  in  the  first  row  of  C.  Simi¬ 
larly,  its  inverse  matrix  could  be  represented  by  a  vector 
C-1,  also  consisting  of  the  elements  of  the  first  row  of 

c->. 

As  we  know, 


IV.  Detection  of  Output  Data 

In  general  it  is  possible  for  the  elements  of  an  inverse 
matrix  to  be  bipolar  (for  real  original  matrices)  or 
complex  (for  complex  original  matrices).  However,  the 
coherent  optical  system  produces  at  its  output  a  dis¬ 
tribution  of  intensity  representing  the  squared  moduli 
of  the  elements  of  the  inverse  matrix.  Limiting  con¬ 
sideration  to  real-valued  input  matrices,  some  means 
for  determining  the  sign  of  a  real  output  must  be 
found. 

One  possible  solution  is  to  resort  to  interferometric 
detection  at  the  output,  from  which  information  con¬ 
cerning  the  signs  associated  with  the  various  matrix 
elements  can  be  found.  Such  an  approach  has  the 
disadvantage  of  significantly  complicating  the  optical 
system.  We  consider  here  an  alternative  approach 
which  seems  considerably  simpler  to  implement.  Our 
approach  rests  on  a  certain  shift  property  of  a  circulant 
matrix  as  we  now  discuss  in  detail. 

A.  Shift  Property  of  a  Circulant  Matrix 

We  first  define  a  shift  of  a  matrix  to  be  the  addition 
of  a  complex  constant  to  each  element  of  the  matrix. 
The  shift  property  of  a  circulant  matrix  can  be  stated 
as  follows:  if  a  circulant  matrix  C  is  shifted  to  C, ,  the 
inverse  matrix  [C,]~ 1  is  also  shifted  from  C-1  with  an¬ 
other  shift  constant  In  other  words,  if  C,  ■  C  +  S,  then 
[C,]-1  *  C_1  +  E,  where  S  and  E  are  matrices  contain¬ 
ing  constant  elements. 

Furthermore,  it  can  be  shown  that,  if  all  the  elements 
of  S  are  s,  the  elements  of  E  are  all  equal  to  Ns/(Xj(X| 
+  Ms)],  where  N  and  Xi  are  the  dimension  and  the  first 
eigenvalue  of  matrix  C,  respectively. 
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Now  if  C  is  shifted  by  vector  S, 
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with  C,  *  C  +  S,  then,  from  Eqs.  (11)  and  (13)  and  by 
recalling  the  DFT  of  S,  a  vector  with  constant  elements, 
has  only  one  nonzero  element  (the  dc  component)  3s, 
we  obtain 


"X|  +  3* 
C.  -  X* 

.  X, 


(14) 


Finally,  from  Eqs.  (12)  and  (14), 
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that  is,  C71  ”  C-1  -  E,  where  the  second  shifting  vector 
is 
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B.  Implementation  of  Sign  Retrieval 

With  the  help  of  the  shift  property  of  a  circulant 
matrix  we  now  can  retrieve  the  sign  information  asso¬ 
ciated  with  the  elements  of  the  inverse  matrix.  When 
an  input  is  designed,  we  interleave  N  additional  rows 
between  the  N  original  rows  of  the  input  Each  of  the 
additional  rows  represents  a  shifted  version  of  one  of  the 
original  matrices.  Therefore,  an  input  representing  N 
matrices  is  now  expanded  to  2 N  rows;  i.e.,  N  pairs  of 
rows.  Similarly,  at  the  output  plane,  we  will  also  have 
N  row-pairs.  Each  pair  represents  two  respective  in¬ 
verse  matrices,  one  of  which  is  shifted  from  the  other. 
For  example,  if  the  elements  of  one  row  are  |a,6,c, . .  |, 
those  of  the  other  are  |a  -  e,b  -  efi  -  e, . .  .J,  where  e  ■ 
Afc/lAjfA]  +  Ns)]  with  all  letters  having  the  same 
meanings  as  in  Sec.  IV.A. 

e  can  be  easily  calculated  because  A]  is  simply  the  sum 
of  all  the  elements  of  the  original  matrix  and  s  is  the 
shift  constant,  which  can  be  any  appropriate  real 
number  (for  the  1-D  input  case).  When  Aj  «  0,  e  will 
be  infinite.  However,  this  implies  that  the  original 
circulant  matrix  is  singular,  which  is  a  case  we  are  ex¬ 
cluding  here. 

With  a  square-law  detector  the  values  of  | a2J>2*2,  ■  •  I 
and  |(a  -  e)2,(b  -  e)2,(c  -  e)2, . .  .|  can  be  measured. 
Thus,  the  remaining  problem  of  finding  the  values  of 
|a,6,c, . .  .|  is  quite  simple  and  can  be  solved  rapidly  with 
a  digital  computer. 


V.  Inversion  of  Toeplitz  Matrices  using  Circulant 
Approximations 

Until  this  point  we  have  considered  the  inversion  of 
only  circulant  matrices.  Using  approximation  methods 
developed  by  Jain,7  it  is  also  possible  to  invert  banded 
Toeplitz  matrices  using  the  same  techniques  supple¬ 
mented  by  some  digital  processing.  The  resulting  hy¬ 
brid  processor  can  be  envisioned  for  use  in  inverting  a 
multitude  of  Toeplitz  matrices  simultaneously.  The 
digitial  processing  required  can  be  divided  so  that  highly 
parallel  digital  hardware  can  be  used  with  each  separate 
digital  processor  devoted  to  inverting  a  small  Toeplitz 
matrix,  while  the  optical  processor  inverts  a  large  cir¬ 
culant  matrix.  In  what  follows  we  describe  the  ideas 
that  lie  behind  such  an  approach. 

According  to  Jain,7  a  Toeplitz  matrix  can  be  extended 
with  terms  so  that  it  becomes  a  circulant  matrix.  This 
technique  is  called  circulant  decomposition.  For  ex¬ 
ample,  the  circulant  matrix  H,  is  extended  from  a 
Toeplitz  matrix  H  as  shown  below: 
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where  Bn.B12.B21,  and  B22  are  four  block  matrices,  each 
of  which  has  the  size  shown. 

It  has  been  proved7  that  the  relation  between  the 
inverse  of  the  original  Toeplitz  matrix  T  and  the  matrix 
He  is 

H*>  -  Bn  -  BitBtt-'Bn.  (15) 

Since  B22  is  still  a  Toeplitz  matrix  but  with  a  smaller 
size  of  k ,  the  advantage  of  decomposition  now  can  be 
seen. 

Apart  from  the  algorithm  for  inverting  a  Toeplitz 
matrix  without  decomposition,  this  method  includes 
four  steps:  first,  extension  of  the  banded  Toeplitz 
matrix  to  a  circulant  one;  second,  inverting  the  circulant 
matrix  by  performing  a  forward  FFT,  inverting  s  diag¬ 
onal  matrix  and  performing  an  inverse  FFT;  third,  in¬ 
verting  a  Toeplitz  matrix  of  small  size,  which  is  em¬ 
bedded  in  the  inversion  of  the  circtilant  matrix.  This 
latter  inversion  is  accomplished  with  a  conventional  fast 
algorithm.  Finally,  the  matrix  multiplication  shown 
in  Eq.  (15)  must  be  performed.  Table  II  shows  the 
numbers  of  computations  required  by  two  algorithms. 
One  is  the  well-known  Trench  algorithm,8  the  fastest 
conventional  algorithm  for  inverting  a  Toeplitz  matrix, 
and  the  other  is  the  algorithm  with  decomposition  as 
described  above.7 

In  Table  II,  n/  is  the  number  of  operations  involved 
in  the  FFTs  for  finding  the  inverse  of  the  circulant 
matrix;  i.e.,  rifm(N  +  k)  k>g<N  +  A).  It  can  be  seen  that 
the  computations  required  by  the  decomposition 


v  **  V 


methods  are  much  less  than  those  for  the  Trench  algo¬ 
rithm  when  k  «  A'  (a  banded  matrix  case).  In  addition, 
when  A  *  logoJV  so  that  2 n/  »  3A-  +  0(N)  as  in  practice, 
the  computations  for  inverting  the  circulant  matrix 
become  the  greatest.  For  example,  if  N  -  512  and  k  - 
10.  then  2rif  =»  9000  and  3k-  +  0 (N)  *  1000.  This  result 
supports  the  idea  of  using  an  optical  and  digital  hybrid 
system  to  accomplish  the  inversion  of  a  banded  Toeplitz 
matrix  with  the  optical  system  taking  care  of  the  large 
circulant  matrix  and  the  electronic  digital  computer 
performing  the  remaining  operations. 


VI.  Concluding  Remarks 

In  this  paper  we  have  discussed  methods  by  which 
coherent  optical  systems  can  be  used  to  diagonalize  and 
invert  large  circulant  matrices.  One  configuration  al¬ 
lows  the  simultaneous  inversion  of  many  such  matrices 
in  parallel.  The  methods  can  also  be  extended  to  the 
problem  of  inverting  a  multitude  of  banded  Toeplitz 
matrices.  The  simultaneous  inversion  of  many  such 
matrices  may  be  a  useful  capability  in  the  processing  of 
data  from  arrays  of  sensors.  Often  the  data  from  such 
arrays  are  broken  into  many  spectral  bands  with  a 
separate  matrix  inversion  operation  required  for  the 
covariance  matrices  appropriate  for  each  spectral  band. 
Such  operations  are  computationally  very  intensive,  and 
the  availability  of  a  fast  parallel,  circulant  matrix  in¬ 
version  system  could  speed  the  computations  consid¬ 
erably. 
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Appendix 

From  W-‘CW  -  A,  we  have  (W-'CW]  r  «  a,  where 
T  denotes  the  transpose  of  a  matrix.  Therefore,  W  TC  T 
-AWT  But  WT  -  W,  then  WC T  «  AW. 

Writing  out  the  first  column  of  both  sides  of  the  above 
equality,  we  have  WC  -  A,  where 
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